Searchable C/C++ DB surpasses 275 million lines

Some statistics to get you started by Anonymous Coward · 2005-12-05 05:28 · Score: 5, Funny

I'm currently looking for suggestions on what sort of 'interesting statistics' I could create from 275+ million lines of open source C/C++ code.

The following "interesting statistics" come to mind:

Percentage of functions named "deepThroat" (0%)
Number of comments mentioning a "girlfriend" (11) or "wife" (29) to "Natalie Portman" (41)
How many variables named "penis" are of type "long" versus type "short" (unknowable!)

You gotta get the variables searchable. Most critical for that last statistic. Also, I'm too lazy to learn Lucene Query Parser Syntax, so the statistics for "Natalie Portman" may include references to "portman."

useful statistic by kunzy · 2005-12-05 05:30 · Score: 5, Funny

the time from the frontpage article on /. to the death of your server?

Re:useful statistic by Sembiance · 2005-12-05 05:33 · Score: 5, Funny

Well, it's been about 2 minutes on slashdot... my site is already dead. So uhm... 2 minutes?
Re:useful statistic by Baricom · 2005-12-05 06:11 · Score: 4, Funny

So uhm... 2 minutes?

Sounds like you should have written it in C++ instead of a laggard language like PHP ;).

My vote is for... by Anonymous Coward · 2005-12-05 05:31 · Score: 5, Insightful

How many lines consist of:
}

Re:My vote is for... by Anonymous Coward · 2005-12-05 05:43 · Score: 2, Funny

Probably about as many lines consist of: {
Re:My vote is for... by epiphani · 2005-12-05 05:43 · Score: 4, Interesting

Same type of thing, but indenting styles. K&R vs. BSD, etc. I'm curious how that breaks up.

(Partial to BSD style myself..)

--
.
Re:My vote is for... by mebollocks · 2005-12-05 06:33 · Score: 5, Funny

I dunno, maybe you could find the algorithm on the net somewhere? ...if only there was some kinda searchable code database of some sort...
Re:My vote is for... by Triple+Click · 2005-12-05 06:46 · Score: 2, Insightful

Depends whether you do this:

if (cond) {
}

or this:

if (cond)
{

}
Re:My vote is for... by Anonymous Coward · 2005-12-05 07:36 · Score: 2, Informative

K&R!!
ONLY K&R!!!!

Seriously, I am a K&R maniac, which caused me to get quite irritated at one of my professors, who once wrote "confusing braces" on a programming assignment I handed in. (It was a little confusing, but because I was being clever and efficient, not because of my braces preferences.)
I think the proportion of code written in K&R vs. The Incorrect Styles would be very interesting to see.
Re:My vote is for... by baadger · 2005-12-05 07:57 · Score: 5, Interesting
There's an idea right there: how about some stats showing the popularity of various coding conventions?
- Variables: under_score vs. camelCase
- Tabs vs. spaces
- "if (cond) {" vs. "if (cond)\n{"
- How many coders bother enclosing single conditionally executed statements with {}
- How many coders bother producing comments directly before or after function definitions, describing function implementation?
- Lines of comments to lines of code ratios
- Number of functions to lines of code ratios for various projects?
- Number of projects making use of global variables?
- C, to C++, to C# (if your engine covers it) project ratio
etc
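
A rough sketch of how the brace-placement stat in the list above could be tallied. The names BraceStats and count_brace_styles are made up for illustration, and the check is purely line-based, so braces inside comments or string literals get miscounted; it is a starting point under those assumptions, not how the site actually indexes code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical tally of brace placement; the names are illustrative only.
struct BraceStats {
    std::size_t same_line_open = 0;  // "if (cond) {" style
    std::size_t own_line_open = 0;   // "{" alone on its line
    std::size_t own_line_close = 0;  // "}" alone on its line
};

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string &s) {
    const char *ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Classify each line by where its braces sit. A real tool would tokenize
// first so that comments and string literals are ignored.
BraceStats count_brace_styles(const std::string &source) {
    BraceStats stats;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        const std::string t = trim(line);
        if (t == "{") {
            ++stats.own_line_open;
        } else if (t == "}") {
            ++stats.own_line_close;
        } else if (!t.empty() && t.back() == '{') {
            ++stats.same_line_open;
        }
    }
    return stats;
}

// Small usage check covering both styles discussed in the thread.
void brace_style_example() {
    const BraceStats s = count_brace_styles("if (a) {\n}\nif (b)\n{\n}\n");
    assert(s.same_line_open == 1);
    assert(s.own_line_open == 1);
    assert(s.own_line_close == 2);
}
```

The same loop also answers the earlier "how many lines consist of }" question, since that is just the own_line_close counter.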

Similarity checking by roguerez · 2005-12-05 05:31 · Score: 5, Funny

Find similarities with stuff like SCO.

Interesting stats by sparkes · 2005-12-05 05:32 · Score: 4, Interesting

How many lines contain expletives?

--
blog and junk

Re:Interesting stats by moosesocks · 2005-12-05 07:05 · Score: 4, Informative

How many lines contain expletives?

for your reading pleasure.... the linux kernel fuck count

--
-- If you try to fail and succeed, which have you done? - Uli's moose

SCO by cmburns69 · 2005-12-05 05:32 · Score: 2, Funny

With all that code indexed, maybe we'll finally be able to figure out what the heck SCO's talking about.

But then again, probably not...

--
Online Starcraft RPG? At
Dietary fiber is like asynchronous IO-- Non-blocking!

Statistics: by duckpoopy · 2005-12-05 05:32 · Score: 5, Interesting

1. Lines per function
2. Comment / command ratio
3. Number of curse word variable names

--
word.

Re:Statistics: by gronofer · 2005-12-05 05:38 · Score: 2, Insightful

4. The number of times the wheel has been reinvented.
Re:Statistics: by Anonymous Coward · 2005-12-05 05:40 · Score: 3, Informative

From the stats page if you cannot get to it...

Overall Stats
Number of Packages: 10,931
Total Number of Files: 1,151,819
Total Lines of Code (No comments, no blank lines): 283,119,081
Total of All Lines: 420,355,464
Total Number of Functions: 7,782,468
Total Number of Functions Called: 69,500,700
Total Number of Macros: 9,947,564
Total Number of Classes: 209,361
Total Number of Comments: 38,125,107
Total Number of Structures: 554,178
Total Number of Unions: 19,687
Total Number of Includes: 5,904,187
Re:Statistics: by maxwell+demon · 2005-12-05 06:30 · Score: 2, Funny

Total Number of Functions: 7,782,468
Total Number of Functions Called: 69,500,700

So the code calls 61,718,232 functions which don't even exist?

But maybe they just meant "Total Number of Function Calls" :-)

--
The Tao of math: The numbers you can count are not the real numbers.
Re:Statistics: by Sembiance · 2005-12-05 08:07 · Score: 2, Informative

You can see the license type broken down here:

http://csourcesearch.net/license/

You can also click on any of those licenses and then on that page choose to only search for code found in that license.

ratio by FreeBSDbigot · 2005-12-05 05:33 · Score: 5, Funny

... of "foo" to "bar."

--
Orange whip? Orange whip? Three orange whips.

Re:ratio by ahem · 2005-12-05 07:24 · Score: 4, Funny

From google:

Search -- foo -> Results 1 - 10 of about 26,600,000 for foo. (0.06 seconds)
Search -- bar -> Results 1 - 10 of about 385,000,000 for bar [definition]. (0.16 seconds)
Search -- foo bar -> Results 1 - 10 of about 7,900,000 for foo bar. (0.12 seconds)

'bar' wins. This intuitively makes sense, as who would want to go to the 'foo' for a drink, or eat an 'energy foo'? Could you imagine a lawyer being 'dis-fooed'?

--
Not A Sig

Suggestion by lbmouse · 2005-12-05 05:33 · Score: 5, Funny

"I'm currently looking for suggestions..."

How about a new server?

Slashdot Block by Yerase · 2005-12-05 05:34 · Score: 3, Interesting

I love the GeShi page, how it blocks everything from Slashdot. Setup a site to advertise a product, then restrict people from using it....

URLs on this server linked by slashdot.org will be refused. Permission is given to slashdot to mirror content as necessary for the purpose of providing its users access to the information on the site. Slashdot should not attempt to bypass the referer block. Use of the google cache page for the site is acceptable as long as the page(s) concerned have no more than 1 image.

Re:Slashdot Block by lowrydr310 · 2005-12-05 05:41 · Score: 2, Insightful

This policy is employed for the sole purpose of avoiding a huge bandwidth bill that I would have to pay out of my own pocket. Anyone who would like this restriction to go away is more than welcome to send me bucketloads of cash.
If you don't want to pay a big bandwidth bill then don't run a webserver.
Re:Slashdot Block by wampus · 2005-12-05 05:45 · Score: 2, Interesting

That's why I use Cacheout. It's a Firefox extension that adds a context menu item to coralize any link. Bypass the restriction AND not kill the site, all at the same time.
Re:Slashdot Block by b4k3d+b34nz · 2005-12-05 05:53 · Score: 2, Insightful

Why would anybody WANT to pay a big bandwidth bill? It's called being smart so that he doesn't get the shaft when he has to pay his utilities this month.

--
Grammar Lesson: you're is a contraction of "you are"; your means you possess something; yore means days gone by.
Re:Slashdot Block by gstoddart · 2005-12-05 06:20 · Score: 2, Insightful

This policy is employed for the sole purpose of avoiding a huge bandwidth bill that I would have to pay out of my own pocket. Anyone who would like this restriction to go away is more than welcome to send me bucketloads of cash.

If you don't want to pay a big bandwidth bill then don't run a webserver.

That's a little harsh don't you think?

It's one thing to run a site and have reasonable expectations of having "enough" bandwidth for your projected traffic, and it's another thing to pay for a slashdotting on an ongoing basis.

This person has decided they don't really want to be linked from Slashdot.

It's hardly an all-or-nothing thing ... for my personal web-site, the several gigs of traffic I'm allowed per month are more than adequate. But I'm sure as hell not going to pay extra to have enough on the off-beat chance that everyone in the world suddenly wants to see my site.

--
Lost at C:>. Found at C.
Re:Slashdot Block by Kjella · 2005-12-05 06:25 · Score: 2, Informative

If you don't want to pay a big bandwidth bill then don't run a webserver.

For every problem, there is a solution that is simple, elegant and wrong. In every other market, the more demand there is, the higher the price/revenue/profit. Web servers are pretty much the only place where you lose more money the more popular you are (e-commerce sites and such not included). If so many people want the content, they can find a way to share it. Even then they're getting a bloody good deal, if you ask me. What exactly are you complaining about, that they aren't generous *enough*? Blocking slashdottings is a small price to pay compared to turning it into a [ad] pile [ad] of [ad] advertisements or subscription site. That is what you do if you "don't want a big bandwidth bill".

--
Live today, because you never know what tomorrow brings

Choice of db? by Anonymous Coward · 2005-12-05 05:35 · Score: 4, Interesting

So, this is not a flame, but I'm curious about your choice of dbs.
I've used mysql for some small projects, but generally it doesn't handle
millions of rows (although the upper limit on rows can be patched with
some additional behaviors). So, for big dbs, I use postgresql.

How did you decide to use mysql? (Was it that the project started,
and grew, or did you know it would handle large numbers of rows
from the start)?

Just curious. This is probably going to be viewed as a flame by many
(particularly those who don't really use dbs very much, but use them
enough to have strong opinions).

Re:Choice of db? by Sembiance · 2005-12-05 08:18 · Score: 4, Informative

I've used MySQL in the past for some projects at work, where the number of rows were several hundred million and ran with no problems so I knew it was capable of large row numbers.

I initially used their FULLTEXT indexing as well, but it dies a horrible death with a large number of rows or search terms. (The developers that live in #mysql on Freenode confirmed this)

So I had to hand off searching to Lucene, which worried me a great deal (being java) but as folks tell me 'Java is not slow'.
They are right, Java is very fast at handling the searching and I've been very impressed.
Most searches in the Java database only take one or two seconds.
The MySQL query/join for additional info take another 4 or 5 seconds.

Most searches take about 8 seconds to come up, even under no load.

I simply don't have enough RAM to keep the necessary MySQL indexes in RAM and use index only queries.

Statistics TM (c) by chunews · 2005-12-05 05:38 · Score: 5, Interesting

It would be interesting to see the number of different copyright notices contained within all that source code, and then to present the notices in groups, like GPL GPL2, etc..

Also, I would really like to find "patient 0" for sourcecode. For example, is there a common library or utility function (perhaps Hex2Ascii?) that *everybody* uses? Well, who wrote it first?

And in a similar vein, who are the "top 5-10-100" authors of open source code by use, reuse, KLOC, etc.? Not of too much use unless I were awarding the Nobel prize for programming, or perhaps creating a list of individuals for the RIAA to sue, after they're done with their other useless lawsuits. :)

Interesting Statistics by iso-cop · 2005-12-05 05:39 · Score: 5, Interesting

In the software engineering world, people will be interested in all sorts of code metrics such as cyclomatic complexity, operator/operand counts, lines of code per module, and such as well as object oriented metrics for the C++ code (depth of inheritance, for example). If you can marry these sorts of metrics with defect data (bugs) for each of the modules then you have a useful data repository for predicting defects in source code. Keeping around different versions of modules changed is also valuable here. If you can gather information on how long it took to produce the module and how long it took to correct defects in the module you are getting even better. If you make it easy to reuse the C and C++ modules...even better.

Re:And then... by Sembiance · 2005-12-05 05:39 · Score: 5, Interesting

Advertise? No, I'm just a single coder doing this for fun and hope that some people will find it useful.

Amazon style statistics by tod_miller · 2005-12-05 05:39 · Score: 4, Interesting

I was very impressed with Amazon, who for each book say which phrases and words were particularly unique to that book. (Reminds me of that google game where you try and get any two words with only 1 hit.)

So show code with coloured background to the lines, from green to red, green being 'normal every day boiler plate' code, red would mean this code must be more specialised, or written by some half-wit l33t h4x0r at least.

I forgot what they called it, but they had 3/4 visible stats based on the semantics of the stuff, probably more under the 'hood (omg lol).

word. Oh some adhesion stats would rock!

please type the word in this image: adhesion
random letters - if you are visually impaired, please email us at pater@slashdot.org

--
#hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com

Re:And then... by guaigean · 2005-12-05 05:42 · Score: 2, Funny

My apologies then. As a regular Slashdotter it is forbidden for me to RTFA.

--
Microsoft Sucks, F/OSS Rocks. I get mod points now right?

The basics and more by PetriBORG · 2005-12-05 05:44 · Score: 2, Insightful

Start with the basics, and then move on..

Whitespace to code ratio
Counts for each of the dirty 7
Line counts that just contained () or {} or []
A list of projects the code is from
And then more interestingly, I'd like to run some sort of program on it to find similarities in code, to see how much one code base overlaps with another. It would be interesting to see if OSS actually does share code between projects or if it's all NIH (not invented here).

--
Pete/Petri "damn, my chainsaw is clogged with 1's and 0's again." --clyde

Hit Refresh by everphilski · 2005-12-05 05:44 · Score: 4, Informative

Just hit refresh and the webserver won't get the HTTP_REFERRER (granted you'll have to manually delete the text file he serves you)

-everphilski-

Re:Hit Refresh by sglane81 · 2005-12-05 06:03 · Score: 2, Funny

Actually, if you click refresh on a page from a link, it will resend the referrer as well. Most browsers do this. One more thing, you spelled HTTP_REFERRER correctly, which is wrong :) It's spelled HTTP_REFERER, only has one R. Reverse grammar nazi FTW?

--
This is the Internet. You can say "fuck" here. - AC

interesting stat by bsdluvr · 2005-12-05 05:45 · Score: 3, Funny

1) randomly select 2000 lines of code
2) compile
3) execute
4) ???????
5) PROFIT!

Woman by chris_mahan · 2005-12-05 05:46 · Score: 2, Funny

I'd like to know whether the word "woman" appears anywhere, and if so, in what projects.

Eh.

--

"Piter, too, is dead."

Measurements I have made by derek_farn · 2005-12-05 05:47 · Score: 4, Insightful

Source code usage measurements contain many surprises (i.e., developers don't always write what people think they do). Some statistics I have collected, on a smaller code base, are available here. The source of the tools used to extract much of the data (at least for those tables and figures I produced) is available here (C only at the moment).

Being able to search so much source is also very useful. I was involved in a discussion a while back about the frequency of use of bessel functions in programs (I claimed rare). The handful of uses returned from your database helped back up my argument (dare I say prove it).

Keep up the good work!

Sounds kind of like the PMD scoreboard... by tcopeland · 2005-12-05 05:48 · Score: 4, Interesting

...that is, a static analysis of a bunch of Java SourceForge projects. It does unused code and duplicate code detection... sometimes it finds some interesting things.

PMD home page is here, book site is here.

--
The Army reading list

Re:What? Millions of code? by tgd · 2005-12-05 05:49 · Score: 4, Informative

It's a searchable database OF code from other products, containing 275 million lines you can search across.

It's not a searchable database written in 275 million lines of code.

Please check for this: comma in brackets in C++ by Animats · 2005-12-05 05:58 · Score: 5, Interesting

C++, for historical reasons dating back to C, has weird semantics for commas in brackets. The operator precedence for commas is different inside of "()" and "[]".

So tab(i,j) is a function call with two arguments. But tab[i,j] is an invocation of the "comma operator", then a function call with one argument. The default "comma operator" ignores the first argument and returns the second. It once had some uses in C macros.

I've argued with the C++ committee about this. If "operator[]" had the same syntax as "operator()", we could have support for multidimensional arrays in C++. But there's a concern that somewhere, someone might have code that depends on the current semantics of the comma operator inside square brackets.

This new archive offers the opportunity to eliminate that possibility. So, do this search: Find, in non-comment standard C++ code, any occurrences of a comma operator within square brackets. Eliminate any where there are parentheses within the square brackets enclosing the comma. Can you find any? In any production code? In any open-source project? Anywhere?

Re:Please check for this: comma in brackets in C++ by Vorondil28 · 2005-12-05 06:17 · Score: 3, Insightful

I've argued with the C++ committee about this. If "operator[]" had the same syntax as "operator()", we could have support for multidimensional arrays in C++.

I'm no C++ expert, but isn't int array[row][col] a multidimensional array?

--
This sig rocks the casbah.
Re:Please check for this: comma in brackets in C++ by chris+macura · 2005-12-05 06:29 · Score: 4, Informative

Yes, they are. But from an OOP standpoint, it's impossible to create a data structure that "knows" you're using the [] operator twice. So if you overload the [] operator in an array structure, to get multi-dimensional arrays you have to nest single-dimension arrays, which is almost always inefficient because the rows (or columns, depending on whether you're row-major or column-major) are lying around the RAM (depending on where they were allocated), rather than in a contiguous chunk like with C. In other words, you can't do something like this in C++:

class SmartArray {
public:
    SmartArray(int height, int width);
    int operator[](const int &x, const int &y) const;
    // ...
};
...
SmartArray a(5, 5);
a[12, 13];
Re:Please check for this: comma in brackets in C++ by milgr · 2005-12-05 06:38 · Score: 2, Informative

The grandparent got it correct. C does support multidimensional arrays. I suspect that C++ does too.
To validate, I pulled out my copy of K&R 2nd edition (Actually a copy I once rescued from a trash bin, and my copy is only "Based on Draft-Proposed ANSI C"). In section 5.9 Pointers vs. Multi-dimensional Arrays it points out,
Newcomers to C are sometimes confused about the difference between a two-dimensional array and an array of pointers, such as name in the example above. Given the definitions
int a[10][20]; int *b[10];
then a[3][4] and b[3][4] are both syntactically legal references to a single int. But a is a true two-dimensional array: 200 int-sized locations have been set aside, and the conventional rectangular subscript calculation 20×row+col is used to find the element a[row,col]. For b, however, the definition only allocates 10 pointers and does not initialize them; initialization must be done explicitly, either statically or with code.
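
To make the quoted layout claim concrete, here is a small sketch that checks the 20×row+col calculation against a real array. The helper names flat_offset and layout_matches are invented for this illustration; only the 10 and 20 come from the quoted declarations.

```cpp
#include <cassert>
#include <cstddef>

// Dimensions taken from the quoted declarations int a[10][20]; int *b[10];
constexpr std::size_t kRows = 10;
constexpr std::size_t kCols = 20;

// The true 2-D array is one block of 200 ints; the pointer array holds
// only 10 pointers and no ints at all.
static_assert(sizeof(int[kRows][kCols]) == kRows * kCols * sizeof(int),
              "a occupies 200 int-sized locations");
static_assert(sizeof(int *[kRows]) == kRows * sizeof(int *),
              "b occupies 10 pointer-sized locations");

// The "20 x row + col" subscript calculation from the quote.
constexpr std::size_t flat_offset(std::size_t row, std::size_t col) {
    return kCols * row + col;
}

// Returns true when a[row][col] sits exactly flat_offset(row, col) ints
// past the start of a, measured in bytes over the whole array object.
bool layout_matches(std::size_t row, std::size_t col) {
    assert(row < kRows && col < kCols);
    static int a[kRows][kCols];
    const char *base = reinterpret_cast<const char *>(a);
    const char *elem = reinterpret_cast<const char *>(&a[row][col]);
    return elem - base ==
           static_cast<std::ptrdiff_t>(flat_offset(row, col) * sizeof(int));
}
```

For a[3][4] the offset works out to 20*3+4 = 64 ints from the start of the array.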

--
Where law ends, tyranny begins -- William Pitt
Re:Please check for this: comma in brackets in C++ by The+boojum · 2005-12-05 07:19 · Score: 3, Interesting

I was just going to point this out. I even hacked up a simple example to show it:
struct location {
    int dimension, coordinates[ 20 ];
    location( int first_coordinate ) : dimension( 1 ) { coordinates[ 0 ] = first_coordinate; }
    location &operator,( int const right ) { coordinates[ dimension++ ] = right; return *this; }
};

struct array {
    int matrix[ 100 ][ 100 ];
    int &operator[]( location const &right ) {
        return matrix[ right.coordinates[ 1 ] ][ right.coordinates[ 0 ] ];
    }
};

int main( int argc, char **argv ) {
    array blah;
    blah[ 5, 5 ] = 10;
}
Proof of concept and it doesn't really do anything, but it compiles just fine. I don't see a problem here. A real implementation would probably do some clever stuff so that the optimizer can optimize away the intermediate data structure.
Re:Please check for this: comma in brackets in C++ by hikerhat · 2005-12-05 07:40 · Score: 2, Funny

Well, the obscureness of the comma operator is used by C++ recruiters who think they are really "clever", and in "clever" C/C++ puzzles on usenet. If you took it away, how would you hire C++ programmers and how would you have fun on usenet?
Also, C++ programmers are getting really old, and they don't handle change very well.
Re:Please check for this: comma in brackets in C++ by Old+Wolf · 2005-12-05 08:09 · Score: 3, Insightful

You can do exactly that -- just write a(12,13) instead of a[12,13].
This is a great counterexample to the GP. Changing the meaning
of the comma within square brackets would gain NOTHING and would
mean every existing compiler is now wrong.

The existing C array type is bad enough as it is, why make it
even more unwieldy by introducing a new variant? C++ is already
on the right track: discourage C arrays, and encourage container
classes that have things like bounds checking and automatic
memory allocation.
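
A minimal sketch of the a(12,13) approach described above, combined with the contiguous-storage point raised earlier in the thread. The class name Grid and its interface are hypothetical, not taken from any library, and the bounds check relies on assert, so it only fires in debug builds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-dimensional container; name and interface are illustrative.
class Grid {
public:
    Grid(std::size_t height, std::size_t width)
        : height_(height), width_(width), cells_(height * width, 0) {}

    // Two-argument call operator: grid(row, col) instead of grid[row, col].
    int &operator()(std::size_t row, std::size_t col) {
        assert(row < height_ && col < width_);  // bounds check in debug builds
        return cells_[row * width_ + col];      // row-major, single allocation
    }

    int operator()(std::size_t row, std::size_t col) const {
        assert(row < height_ && col < width_);
        return cells_[row * width_ + col];
    }

    std::size_t height() const { return height_; }
    std::size_t width() const { return width_; }

private:
    std::size_t height_;
    std::size_t width_;
    std::vector<int> cells_;  // all rows stored back to back
};
```

With this, Grid g(20, 20); g(12, 13) = 10; writes one element of a single contiguous buffer, with no nested per-row arrays.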
Re:Please check for this: comma in brackets in C++ by chris+macura · 2005-12-05 10:07 · Score: 2, Insightful

That's the whole point of the complaint. Inconsistency between [] and ().

best_idea_ever by l33t-gu3lph1t3 · 2005-12-05 05:58 · Score: 3, Insightful

charge for a premium service that allows Computer Science and Software Engineering profs to perform a somewhat intelligent search of the code to see just how much of their students' code is lifted off the 'net ;)

--
------- "From bored to fanboy in 3.8 asian girls" ----------

See also: Codase.com by kriegsman · 2005-12-05 06:00 · Score: 2, Informative

See also Codase.com, another "Source Code Search Engine", which lets you search by method names, class names, variable names, free text, etc..

-Mark

Koders.com by knipknap · 2005-12-05 06:01 · Score: 2, Informative

Don't know, koders.com supports a lot more languages and also lets you narrow your search to specific licenses. The few extra lines of code just don't seem to do it, especially because such measures highly depend on the chosen method.

Re:Wtf? by Digital+Vomit · 2005-12-05 06:02 · Score: 2, Insightful

What better reason than to create such a program other than "why not"?

A person who is a true programmer in his soul doesn't ask himself "why". Oftentimes the sheer joy of creating something from nothing is enough.

--
Modern copyright is theft of culture from everyone and it retards the progress of the useful arts and sciences.

How about a potential buffer overflow index? by raddan · 2005-12-05 06:07 · Score: 4, Informative

You can start by seeing how often people use gets(), strcpy(), strcat(), etc... Look for all the fun little common mistakes that people make.
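
One way such an index could be started, as a sketch: count textual call sites of a given function name. The function count_calls is made up for this example and assumes the name is a plain identifier; because it works on raw text, calls mentioned in comments or string literals are counted as well.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count call sites of `function_name` in `source`: the identifier as a whole
// word, optional whitespace, then "(". Purely textual, so matches inside
// comments or string literals are counted too.
std::size_t count_calls(const std::string &source,
                        const std::string &function_name) {
    assert(!function_name.empty());
    // Assumes function_name contains no regex metacharacters.
    const std::regex pattern("\\b" + function_name + "\\s*\\(");
    const auto first =
        std::sregex_iterator(source.begin(), source.end(), pattern);
    const auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}
```

The word-boundary anchor keeps fgets( from being counted as gets( and strncpy( from being counted as strcpy(.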

stats we'd like to see... by digitaldc · 2005-12-05 06:08 · Score: 4, Funny

-# of non-numerical constants
-# of ( ),{ },\ /,#,; characters in code
-time spent debugging/compiling
-total hours spent in production
-gallons of coffee consumed
-hours of daylight seen
-# of relationships destroyed

--
He who knows best knows how little he knows. - Thomas Jefferson

Code Styles by ionrock · 2005-12-05 06:09 · Score: 5, Interesting

I would love to see if different code styles could be analyzed to see how many people use what sort of syntax style. There is camelCase and under_scores but it seems possible to find more complicated trends that might allow reviews to statistically determine what practices really help to make code better.

Need to watch those stats by Quiet_Desperation · 2005-12-05 06:09 · Score: 2, Funny

For example, "Lines of code" / "Lines of commenting" will always produce "Inf"

histogram of C reserved words by jab · 2005-12-05 06:14 · Score: 5, Interesting

I'd love to see how one of my programs (stats below) compares to the, uh, national average.

1222 if
638 return
482 static
413 for
399 int
217 const
201 else
194 void
128 char
115 case
112 break
55 default
43 sizeof
37 do
35 switch
27 enum
24 struct
23 while
15 float
14 typedef
10 auto
7 unsigned
6 extern
1 long
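
A histogram like this could be produced with something along these lines. This is only a sketch: keyword_histogram is an invented name, the keyword set is supplied by the caller, and comments and string literals are not skipped, so real code would be slightly overcounted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Count how often each word in `keywords` appears in `source` as a whole
// identifier. Comments and string literals are scanned like any other text.
std::map<std::string, std::size_t> keyword_histogram(
        const std::string &source, const std::set<std::string> &keywords) {
    std::map<std::string, std::size_t> counts;
    std::string word;
    // A trailing space is appended so the final identifier is flushed too.
    for (const char c : source + ' ') {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalnum(uc) || c == '_') {
            word += c;
        } else {
            if (keywords.count(word) != 0) ++counts[word];
            word.clear();
        }
    }
    return counts;
}

// Small usage check on a one-line function.
void keyword_histogram_example() {
    const auto counts = keyword_histogram(
        "int f(int x) { if (x) return x; }", {"if", "int", "return", "goto"});
    assert(counts.at("int") == 2);
    assert(counts.at("if") == 1);
    assert(counts.at("return") == 1);
    assert(counts.count("goto") == 0);
}
```

Keywords that never occur are simply absent from the map, which is why the check uses count rather than at for goto.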

Re:histogram of C reserved words by plabtfall · 2005-12-05 07:25 · Score: 5, Funny

Yeah, me too:

2431 int
1802 goto

Re:Size doesn't matter by kmartshopper · 2005-12-05 06:17 · Score: 3, Funny

It's the quality of the search results that counts.

Yeah, keep telling yourself that...

or "// FIXME" by StandardDeviant · 2005-12-05 06:20 · Score: 4, Funny

(subject says it all ;))

--

News for Geeks in Austin, TX

Don't mess around, learn from NLP folks by Xofer+D · 2005-12-05 06:37 · Score: 4, Insightful

This is a good opportunity to build complex statistics about the C++ grammar actually used in context. Learn from the NLP people! Parse the whole thing, and start finding common subtrees in the grammar used. Look at common lexical entries between subtrees, so we can make a tool that can help recognize errors by comparing against commonly used C++ grammar fragments. Or do function completion based on what kind of function you look like you're writing. See if you can do alignment with similar languages and do statistical source translation. If you keep information about comments used (and maybe apply some real NLP), you might even have a shot at automatically classifying functions based on their form, and documenting them with simple comments.

If that's too hard, try finding all n-grams instead, at least under some length. That's a lot more useful than just individual tokens or strings.
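
For the n-gram idea, a bare-bones sketch might look like the following. The function count_ngrams is hypothetical, and splitting on whitespace stands in for a real C++ lexer, which is a big simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Count every run of `n` consecutive whitespace-separated tokens in `source`.
// A real implementation would feed in tokens from a proper C++ lexer.
std::map<std::vector<std::string>, std::size_t> count_ngrams(
        const std::string &source, std::size_t n) {
    assert(n > 0);
    std::vector<std::string> tokens;
    std::istringstream in(source);
    for (std::string tok; in >> tok;) tokens.push_back(tok);

    std::map<std::vector<std::string>, std::size_t> counts;
    for (std::size_t i = 0; i + n <= tokens.size(); ++i) {
        const auto first = tokens.begin() + static_cast<std::ptrdiff_t>(i);
        const auto last = first + static_cast<std::ptrdiff_t>(n);
        ++counts[std::vector<std::string>(first, last)];
    }
    return counts;
}
```

Sorting the resulting map by count would surface the most common token sequences, which is the raw material for the completion and error-detection ideas mentioned above.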

With a lot of data, you can do very cool things. Don't mess around with string frequency counting. C++ is simple compared to English, do something interesting.

--
The Signal/Noise ratio can be improved in two ways. Remaining silent is the OTHER way.

TODOs by mrshoe · 2005-12-05 06:52 · Score: 2, Interesting

Counting the number of "TODO"s and "XXX"s in "production" open source code could be interesting.

--
There are two types of people in this world: those that categorize other people and those that don't.

Re:275+ million lines by gstoddart · 2005-12-05 07:11 · Score: 2, Funny

How about the % of them that would work on a lady in a bar? line 53256 "Hey pretty lady, are you an astronaut because your ass looks out of this world" ....oh....not those kinds of lines....*sigh* and I thought I was so close

No, no, no.

You do not use lines 1..N on the same lady until it works. It's not like breaking encryption -- you don't get to try all the possible keys.

I have friends who have done this, and they swear it's a percentage game. Choose one line you like, and try it on women 1..N until it does work, or you get tired of getting told to sod off. Apparently, with the right combination of variables, any line can be verified to work under some circumstances.

Truthfully, I don't know how anyone can set out with the knowledge they're going to get told to drop dead 70-100 times/night, but I guess if you can live with that kind of failure rate on an ongoing basis, you'll eventually get the success rate you wanted.

Now go forth young geek, and attempt to multiply. ;-)

--
Lost at C:>. Found at C.

Re:And then... by Anonymous Coward · 2005-12-05 07:44 · Score: 2, Funny

I'm just a single coder

-1, Redundant

This is Slashdot, of course we're all single.

HACK, TODO, BUG & FIXME by Pete+Brubaker · 2005-12-05 07:52 · Score: 2, Interesting

I recently did a search on some of our codebase here at work to see how many times the above keywords remained in shipping code. I was a little surprised to see how many cases there were in our code. I think sometimes, maybe even most of the time we as programmers over use these words.

Pete

--
What's a sig? Pete Brubaker

Proposed workaround doesn't work by Animats · 2005-12-05 08:31 · Score: 3, Informative

Yes, that compiles and runs, but it doesn't do what you think it does. Put in some debug print to see what's actually happening, which is this:

"5,5" is evaluated using the built-in definition of ",", returning "5". The no-conversion built-in operator comma has higher priority than the conversion sequence involving a conversion to "location", then the use of the overloaded comma operator. So the built-in comma operator is used. See the discussion in the C++ ARM, section 13.2, "Argument matching": which says "consider an exact match better than any conversion".
"5" is converted to type "location" by the constructor for "location", resulting in a "location" object with "dimension=1" and "coordinates[0]=5".
This "location" object is passed to "operator[]", which then accesses "coordinates[1]", an uninitialized value, which it then uses as a subscript, returning a reference to an arbitrary memory location. So, instead of returning "&blah.matrix[5][5]", it returns "&blah.matrix[???][5]". The example program seems to run in VC++ only because that part of memory happens to be 0 at startup, so this returns "&blah.matrix[0][5]". In other circumstances, it might cause a crash.
"10" is stored into the wrong location of "blah", or outside it, due to the bad reference generated above. This is where the buffer overflow occurs.

You can force the conversion with

blah[ location(5), 5] = 10;

but that's not useful except to see what's happening.

You can't overload the built-in operators for built-in types. So overloading, outside of an object, "operator,(int, int)" won't work either.

Hence the need for a straightforward solution.
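
The difference between the two cases can be checked with a small sketch. The location struct is adapted from the example under discussion, with coordinates zero-initialized so that reading an unset slot is well defined; the two helper functions are invented for this illustration. The comma expressions are written in parentheses rather than inside [] because later C++ standards changed how a bare comma in a subscript is treated.

```cpp
#include <cassert>

// Adapted from the example being discussed; coordinates are zero-initialized
// here so that reading an unset slot is well defined.
struct location {
    int dimension;
    int coordinates[20];

    location(int first_coordinate) : dimension(1), coordinates() {
        coordinates[0] = first_coordinate;
    }

    // Overloaded comma: appends one more coordinate.
    location &operator,(int right) {
        assert(dimension < 20);
        coordinates[dimension++] = right;
        return *this;
    }
};

// Neither operand has class type, so the built-in comma runs: the left side
// is discarded and `second` alone is converted to location afterwards.
// The cast to void only silences the "left operand has no effect" warning;
// a plain (first, second) selects the same built-in operator.
location from_builtin_comma(int first, int second) {
    return (static_cast<void>(first), second);
}

// The left operand already is a location, so location::operator, is chosen.
location from_overloaded_comma(int first, int second) {
    return (location(first), second);
}
```

Calling from_builtin_comma(5, 7) yields a one-dimensional location holding only 7, while from_overloaded_comma(5, 7) yields a two-dimensional one holding 5 and 7.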

Re:Yet another source code search engine? by Sembiance · 2005-12-05 08:36 · Score: 2, Interesting

I just did it for fun, and hopefully some people might get some use out of it.

This engine understands the code at a C/C++ syntax level, unlike koders.com, so you can better search for what you're after (comments, functions, macros, classes, etc).

Also this engine DOES allow you to click on words in the code, but only includes and function or macro calls.

There are several things that are not that great about my site: it's a little slow, doesn't support free text searching nor variable searching, and you can't copy search URLs for pasting (uses XMLHttp and form POSTs).

But it's just me doing this thing, and I have limited time and most importantly limited money/hardware.

My wish is for google to do their own but index a LOT more code and have it be fast and friendly :)

They certainly have the resources to do it and would be a great tool for coders to use. Maybe this will help fill a gap in the mean time :)

Re:useful statistic: parent: -1 troll by Baricom · 2005-12-05 09:12 · Score: 3, Funny

That "woosh" sound you hear is the wink emoticon zooming over your head, joke in tow.

I know PHP is a great web language and that it probably isn't the cause of the slowdown. Heck, even Yahoo! uses it these days.

I was attempting (unsuccessfully, it seems) to make fun of the purists who insist that robust web applications must run on something compiled in order to reach acceptable performance under high load.

Re:histogram of C reserved words - well, B .... by ignavus · 2005-12-05 10:36 · Score: 2, Informative

auto is a throwback to B days (the language immediately before C). B had no data types (no int, float, double, etc) but did have storage types: auto, static, and extrn.

auto was necessary in B for local variables, as a plain variable name by itself was a valid expression statement (as it is in C), not a declaration (IIRC).

1. foo() { auto bar; ... }
2. foo() { static bar; ... }
3. foo() { extrn bar; ... }
4. foo() { bar; ... }

All mean something different in B: the first three instances of bar are declarations, the fourth is an expression statement (and if I remember my B correctly, it is invalid as the first statement of foo(), because bar hasn't been declared one of auto, static, or extrn yet in this function).

In C, auto is completely redundant. Except, perhaps, in comments.

Ah, B. The days when programmers were programmers and data was data, and you could perform any operation you liked on any variable. Want to divide a pointer to a string by 3? Go ahead. Self-disciplined programmers don't need training wheels. Just a choice between auto, static and extrn.

--
I am anarch of all I survey.
