> Exactly how does a gay couple getting married affect you negatively?
Specifically, it will make me give them money. If your spouse dies in federal service, you are eligible for money. From taxes. Which I pay.
But that is really not the issue. The problem with gay marriage is that it grants a moral sanction to homosexuality. When made by an individual, I simply shrug, since his opinion has no value to me. When made by the government, which is supposed to be representing the interests and morals of all the people, it is reprehensible. It is the same reason why the government does not allow you to do drugs; so that if you do it, you do not have delusions that society approves of your actions. A drug user down the street does not hurt me in any way and I couldn't care less what he does in his living room. If he goes out driving and kills somebody, he goes to jail and loses his license; and that's fair. But I certainly don't want him to think that I approve of what he does. Just as I would not approve of him molesting children, having sex with cows, performing satanic rituals, torturing animals, or having sex with other men. If he wants to do it anyway, let him do it cowering and hiding, in shame and regret. Then maybe he will eventually stop. Until he does, I certainly want nothing to do with him. If I give a party, I will not invite him; if I have a job opening, I will not give it to him; if I run a store, I will sell him any goods. Of course, the government will forbid me from doing all that. And since they have the guns, I will comply. But nobody can make me like it, and that is the irony. The society still disapproves and the transgressors will know it.
>> If you keep advocating legalisation of gay marriage, >> God will most certainly want nothing to do with you. > > You talk to him do you?
I don't have to. He certainly has made his point throughout the Bible, and I can't think of a more convincing proof of God's opinion than destruction of Sodom and Gomorrah for homosexuality, among other equally disguisting things. Of course, being an atheist, I only read about it out of curiosity. My views on sodomy have nothing to do with God.
> if he does not want to see interracial marriage > become an accepted practice, surely it is the > right thing to do. Right?
Of course it is. Everyone is entitled to a vote in this country, and if he wants to vote against interracial marriage, it's his choice.
> People used to justify stomping on the civil rights of > black people just like they currently are doing to gay people.
You forget that the rights you currently possess have been granted to you by society. You do not have a right to steal, kill people, practice poligamy, snort cocaine, spread child pornography, and other crimes. Your city may choose to grant you other rights, like specify what you can and can not do with your property. The city might require that you mow your front lawn on a regular basis, clean the sidewalk, keep your house paint looking good, provide access to wheelchaired people, etc. The "civil rights" you now take for granted were likewise granted by society's approval. Giving them to blacks was also granted in like manner, by recognizing them as human beings and thus equal to the white race before the law. But these rights do not just come out of thin are; they are a representation of the moral code of our society and are not unchanging. They can come and go and they will, for blacks, for whites, and for gays.
> Your kids are not going to turn gay just because they see gay people.
Nobody is afraid that their kids will turn gay by looking at other gays. However, they might not want their kids to associate with gays, just as they restrict their kids' social activities with other types of undesirable people.
> don't know about you, but it wouldn't have mattered > how many gay men I saw kiss, boobies would still have given me a woody.
The issue is not whether you would turn gay, but whether you want to see gay men kiss in the first place. Just as we do not tolerate other disguisting habits some people have, so gay behaviour is definitely not socially approved.
> People are so afraid of turning gay - makes me wonder if they're *already* gay.
I don't know why I keep seeing this accusation. When I say I dislike drug users, nobody says "well, it's just because you're afraid of becoming one." Gosh darn it. That must be it. I'm so afraid I might start using drugs, I want to keep away from all those loony druggies.
> The constitution was writen so that people would > not be abused by the government, ANY PEOPLE
So what about people who are disgusted by homosexuality? Why should they be forced by the government to acknowledge gay couples as "married"? Marriage is a an important concept in our society as well as in the legal system. A spouse gains many legal and social rights that are not available to even close friends. Why should anyone be forced to grant those rights to people in a relationship of such a perverted nature?
> I would certaintly say that homosexuals qualify as a minority group.
Then so do those who torture squirrels in their back yards, lust after eleven year old girls, or enjoy eating their own excrement. Because, you know what? That's what those people you call "homophobic" really think of gays. It's not a fear, it's revulsion. It's not hate, it's avoidance. And just as people avoid child molesters, drug users, sadists, and other perverts, they want to avoid gays. Hey, they might even want the word "gay" back to its original meaning! Why should you let the government force gays and their ways upon the recoiling society?
> Just because its a democracy doesn't mean that > the whatever the majority says goes.
Sorry. Democracy is majority rule by definition. If you want to be special, it is not a democracy any more. It's an aristocracy. Ours is an aristocracy of pull - if you can contribute enough money to your friends in Washington, you can do pretty much anything you want.
> And god help us if it ever does start working like that.
If you keep advocating legalisation of gay marriage, God will most certainly want nothing to do with you.
> that won because he openly advocated limiting > civil rights of an etnic group, and used it to divide the country.
People want to have a president who shares their values in order to have the kind of a country they want to live in. In case you haven't noticed, people in 11 states decided that gay marriage was unacceptable. I suspect that majorities in other states hold similar views, and if the majority wants something, in a true democracy they would get it, since that is the definition of democracy.
> It sickens me to think that people who never voted > before said "Whoa, nothing else has mattered to me in > the last 20 years, but the QUEERS WANT TO GET MARRIED! > Jarlene, find me my votin' hat!"
When you see the moral standards of your society being destroyed, what good man would not act, if the decline could be stopped, or at least slowed by simply showing up and voting NO? If he does not want homosexuality to become an accepted practice, surely it is the right thing to do.
> I can look anywhere I like for counter examples, > because you said always. Thats what the arguments > about, remember?
Do you remember?:) I said that an array was always better than a linked list. I did not say that you should use arrays instead of all other data structures. Of course you would use classes for record data, trees for databases, and may be even *gasp* unions for some overlapping data. My point is simply that linked lists are an inferior construct and should always be replaced with something better, even if it is only a linked list of arrays for paging your element additions. As I said in another post, it makes me extremely uneasy to allocate memory for a single element, no matter what, so even when I use linked structures, they are paged linked structures like the B-tree. "Many at once" philosophy will always do better than "One at a time".
> list.cpp, inserting elements into the middle of a > linked list using the STL list template:
You are not doing the same thing. I did not intend to simulate insertion exactly in the middle. If my insertion point remained the same, I could simply preallocate a hole in the vector and copy the new elements into it. So no, it does not blow a hole in my theory.
> Given an array of 1000000 large elements, how > would you insert an new element in the middle?
I have written a sample program to test insertion in the middle of an array (paged allocation and copy to shift) as compared to the same operation on a linked list. Arrays win by a factor of 40: here
People have complained to me that arrays are not efficient for inserting an element in the middle. I was skeptical, because the cache misses ought to slow things down. So here's a small program to test my hypothesis (you'll need http://sourceforge.net/projects/ustl):
#include <ustl.h> using namespace ustl;
int main (void) { #if USE_ARRAY
vector<int> v;
for (uoff_t i = 0; i < 20000; ++ i)
v.insert (v.begin() + v.size() / 2, i); #else
typedef struct _list {
int* p;
struct _list* next;
} mylist;
mylist head;
for (uoff_t i = 0; i < 20000; ++ i) {
mylist* p = &head;
for (uoff_t j = 0; j < i / 2; ++ j)
p = p->next;
mylist* pnew = new mylist;
pnew->p = new int (i);
pnew->next = p->next;
p->next = pnew;
} #endif
return (0); }
Timings for array: real 0m0.308s user 0m0.308s sys 0m0.001s
Timings for linked list: real 0m12.746s user 0m12.685s sys 0m0.006s
Array insertions are 40 TIMES FASTER! The vector class for this implementation uses paged allocation with realloc.
(On 2proc Athlon MP 1GHz, Debian Linux SMP)
With 100000 items:
Timings for array: real 0m10.701s user 0m10.691s sys 0m0.004s
Timings for linked list: real 6m41.403s user 6m40.089s sys 0m0.167s
> how would you insert an new element in the middle? > How would you remove an arbitrary element.
For large datasets you should not be using a simple array. The key to performance is narrowing your scope and keeping your elements in a linked list 1000000 links long is not a solution. It is an atrocity. When you have this much data, the rule is to break it up! Put it in a B-tree. Put it in a file and create an index so you would only load what you need. Pipe it through a sliding buffer. But whatever you do, do NOT put it in a linked list! Do NOT keep it all in memory! Not everyone has 4G of RAM. Not everyone likes to see memory hog programs that occupy all the RAM they do not need.
> No, you over generalised the solution, and have > been continually attempting to weasel your way out > of it by making specific assumptions that favour you.
The way I see it is: I came up with a good solution and have been continually attempting to figure out why you find it unacceptable. So far I have been unsuccessful, and received nothing but insults for my trouble. Thank you very much!
> Again, you're making completely unwarranted > assumptions to steer the answer back to your > unsupportable over-generalisation.
I can't help making assumptions when you have inadequately specified the problem. If you specify all your requirements I would be much better able to present a design to suit them. One suit does not fit all and different requirements may result in completely different designs.
> How do you know when and how I want the values I'm storing?
How, indeed, if you don't tell me!
> I may need them immediately in some other thread.
Then set up a sliding buffer between the threads. If you need to keep all your data and pipe it between threads, I would create a triple-buffer pipeline: one is being written to, second is being read by the "other thread", the third is being paged out to a file. This way I keep all the data, process it as it comes through, use a minimal amount of memory, which can be user-configurable to run on any machine (try that with your linked list!), can restart the computation if the computer crashes (or the user wants to turn it off for the night, like most normal people do), is limited by disk space (unlike the linked list, which is memory limited) which is always more plentiful than RAM. Another advantage is that because the data is written down as it is generated, the user can run another program to read it and visualize it, possibly adjusting the original computation without having to finish it.
> A sliding buffer is fine, as long as you don't mind losing the early data.
You don't have to lose it. If you need to keep it, you can page it off to disk. Use asynchronous IO and double buffer, like you would for graphics. Fill one buffer while the other one is being written, then swap. Set buffer size to match disk performance. Sure you might block now and then, but really, if you deal with humongous amounts of data and need to keep it ALL, there is just no way to avoid using the disk.
> If you have an array of objects, you will have > to instantiate them one at the time anyway, so > you will still have your memory fragmentation.
An array of object pointers is still better than a linked list. You get random access, only a single dereference overhead, easy sorting, no carriers, and a mostly negligible resize operation. You also don't need to have a next member in your objects, which is an important consideration when you want to have a templated container that works on all data types. Carriers (the thing with the object pointer and next pointer) waste memory, increase fragmentation, and are hard to iterate over.
> Huh, I just want to be sure: You're not using > arrays with structures directly in it right?
Usually yes, I am.
> In this case, the cost of re-allocating the whole array is > *much* higher than the cost of reallocating an array of pointers!
Quite true. That's why you want to avoid having to do it by allocating a page at a time. realloc helps a lot as long as you work with one array at a time, but usually I just design my algorithms to not do random inserts or deletes. This optimizes your data for reading, which is what most data is for anyway.
> Anyway, this kind of array is only possible if > you use structures, it won't work for object
Yes it will, but you have to know what you are doing. If you use new [], then there is no problem, but if you use malloc, you need to call placement new to call the constructor of each object. If you use realloc, you have the additional restriction of bitwise copy capability, which means having no pointers to internal variables. All my containers are realloced, and I use a pointer array if I have something that doesn't fit.
> Besides, inserts and deletetions in an array are costly.
Right, but inserts and deletions are usually not a performance problem. Most data arrays are read a LOT more frequently than they are modified. In addition, it is almost always possible to combine your inserts and deletes into a single operation, thus minimizing copy. Profile before you optimize! You will find that reading is more critical than insert/delete.
> Right. Except because your code needs contiguous > memory, it throws an exception much earlier than > the linked lists
If I were actually running out of memory on a regular basis, I would first take a good hard look at the design and reduce memory consumption by any available methods, like sliding buffers or paging out old data to disk. In any case, an out-of-memory condition is a problem for your linked list design just the same except that you might be able to slide by a little longer. The problem is, what if some user has very little memory, like, say 32M? Unless you can actively limit memory consumption of your algorithm, the program will fail whether you use contiguous memory or not.
> That may be an acceptable trade off for better performance BUT IT MAY NOT.
> If I have to get a list of files in a directory, > including all the subdirectories. I don't have to > search through the results, I only have to go > through it. To reduce the overhead, I could even > use a single linked list instead of a double linked list.
Or you could completely eliminate the overhead by not storing the list at all, but by processing it as you read it. Take 'find', for instance: you can execute all your matches and commands on each entry without needing the entire list.
All right, so let's suppose you really do need the entire list. Will I be using a linked list to store the items then? Hell no! Paged allocation is the way to go here. Ok, so what if you need to modify other data while you are reading (like, say, if you are in a multithreaded environment where another thread is allocating memory)? Will I use a linked list then? Not exactly. I'll use a linked list of arrays to do paged allocation. Call it a habit, but my stomach turns at just thinking about allocating memory for a single item. It's just something I never do. Period. It's not a good way to deal with memory. Fragmentation, extra dereferencing, cache misses (two per element in a textbook implementation!), and total lack of access locality (not only is your element not in the cache, but you even lose the prefetch from processing the previous element), make linked lists totally unsuitable for any application. Call me a prude, but it is just something I wouldn't do to my computer.
> A list of elements, each with a data structure > representing results for each year for an > arbitrary number of years. So item 1 has existed > since 1990, so need 14 years worth of data, item 2 > has existed since 2001 so only needs 3. Item 3 has > existed since 1800 so needs 204 years.
Because you have more items than years, the vector overhead (24 bytes in for my implementation) would be a much smaller fraction of the data. The order of item data in the year depends on whether you care more about reading or about adding. For the former, do sorted insertion at the price of a copy, for the latter add them as they come at the price of a linear search instead of binary. The advantage over linked lists is continuity of data for each year; since you probably get yearly data for all items in one batch, adding them all simply consists of a sort and copy into one array. Because yearly blocks are likely to be large, they are good candidates for being in separate files, will compress well ('cause running zlib on a linked list is hard work:) Another advantage is that eventually you will have all the data for all the items that existed in 1996, so that year will not be modified any more. In addition to this you may need item information vector, which will have their range of existence along with all the other fixed size data you probably have on them.
> You're attempting to solve any non deterministic > decision problem, and want to save the state of > some object/variable at each iteration. Now tell > me, how much space am I going to need?
64K. You don't need to access the values you are storing until you are done, so you should write it to a file instead. Since file writes are expensive, buffer them through a small cache. Because you are probably CPU bound, you should allocate a buffer based on number of units you can process per bandwidth of your hard disk. Then turn on nonblocking IO and start your computation. At convenient stopping points call select to check if you can flush your buffer. If you succeed, clear the buffer and continue. This way you are maxing out both IO and CPU, are not bound by the size of the physical memory, are not a memory hog, and are crash-resistant because you conveniently checkpoint your job to storage. Then, when the job is finished, you can read however many gigabytes of data you just generated.
> How would you proceed if the corner case caused realloc() to return NULL?
In the same way you would proceed if you failed to add an element to your linked list: throw an exception. The exception should then be caught at the computation checkpoint level, the available data should be written to disk and the operation restarted.
> Sometimes you don't need searchability, just sequential access. > And thats easy with linked lists, and doesn't have the overhead of building a tree.
It is also easy with sliding buffers, where you have no overhead at all. If you have data just pouring in and you can access it sequentially, you should break it up into manageable packets and queue them in a sliding buffer. This way it is really easy to split work between several CPUs. Also, in the common case you would be heavily using the same memory location over and over (good for cache), with never a reallocation and no copying either if you can grain it on packet size.
> But, unless thats in your specs, if you assume > your data is going to behave nicely
And what sort of data you have that can not be made to behave nicely? Why don't you tell me what it is, and I'll see if I can figure out how an array can be used for it. Challenges are fun:)
> all you'll get is a program that doesn't work in corner cases > when it doesn't. Not smart. Not good programming.
Not quite right: the program will still work in all corner cases, it will just be slower sometimes. But corner cases are, by definition, rare, and an occasional performance hit might not even be noticable.
> Losing the original data wasn't the problem. The problem is, > you've got no place to put the new data, which was the > reason you did the realloc() in the first place.
This would be true for really large datasets that approach the size of your memory. The swap might save you for a while, but after that you really would have problems. In this case it really would be a bad thing to use an array. In fact, a linked list is a really bad solution here as well. What you need is a tree of some kind, so you can search it in a reasonable time. My machine has 512M of RAM, so by the time you hit this case you'll be working with something like 250M of data, and if it is in a linked list, it will take you hours to find element 31827481 and compare it with element 9481823. With data sets this big, which are frequently modified to boot, you are dealing with a database, not any regular array. So you'll have to write a database engine to get any kind of performance out of it. This is not, of course, what I had in mind; I was talking about mundane containers in the code for sundry data you may need to run it. Usually these are a few megabytes at the most, which is more than manageable.
> scientists say they have been trying to reduce > weight and improve the performance of onboad instrumentation.
I guess they had two options: enroll all the scientists in a weight loss program, or hack the instrumentation to weigh less. Looks like they opted for the latter.
> there may be no copying involved as it will extend your allocated block if possible
In the general case there is no copying involved. Most of the time when you add elements, you add them to one container at a time, so there will (usually) not be any more blocks above the one you got and it will extend indefinitely until you run out of elements. Second: if you do get a copy, you will most likely get it only once, since after that your block will be at the end of the memory space. Third: arrays do not grow indefinitely. Some elements get added, some get removed, and eventually it settles down to some steady state size after which point no memory allocation is going to happen, and steady state performance is more important than startup performance.
> hit the limit of the contiguous space above your initial > allocation and bang your program performance goes out of the window
If you use a linked list, your performance is not too high to begin with, from two dereferences and two cache misses for every element. As for the copy, a one-time event does not impact performance in any meaningful way; it is the common-case performance that matters.
> assuming you can get a sufficiently large contiguous > block of memory, otherwise, realloc returns NULL > and you're totally screwed.
First of all, you are not totally screwed, because you did not assign the return value of realloc directly to your pointer (you didn't, right?). realloc will not touch the original data if it fails reallocation. However, none of this matters on Linux, where realloc never returns NULL. With the "optimistic" memory strategy you get the pointer anyway and then you get killed by the OOM killer. Of course, new does the same thing, so your linked list is no improvement.
> But you're assuming that speed matters in this > section. The code I'm using a linked list for is > not used enough to make such optimisation > neccesary. When speed matters, I use an array.
Ah, but code size matters too. When you don't need to make it fast, make it small (and if you need to make it fast, small is a pretty good idea too). Array code is smaller and easier to debug than linked list code. You also get the advantage of being able to look at the entire array in the debugger (priceless!). This is why I never use anything but arrays and design all my code specifically for arrays: sequential access whenever possible, no random modifications, inserts only at the end, combine any modifications into blocks, etc. This is guaranteed to get you excellent cache usage (or at least, as good as it is going to get) and I have yet to see an application where I couldn't fit such a design.
> I'm pretty sure the implementation I use > allocates a new chunk of memory and copies.
Yes, that is correct, SGI STL implementation does just that. However, it also doubles the page size every time, so you get log2 allocations for your element adds, while a linked list gives you 2n allocations. It is possible to keep the old block, but you then have to make sure your objects can be bitwise copied (mostly means no pointers to internal variables) because there is no 'renew' and you get to use realloc. See http://sourceforge.net/projects/ustl for an STL implementation that does this.
void intvec::reserve (size_t n) {
if (n <= capacity())
return;
n *= sizeof(int);
size_t newSize = align (n, c_PageSize);
int* newBlock = (int*) realloc (m_Data, newSize);
if (!newBlock)
throw bad_alloc;// Note that m_Data is valid
m_Data = newBlock;
m_Allocated = newSize / sizeof(int);
fill (end(), begin() + capacity(), 0); }
It gets a little more complicated for object arrays (need to use placement new to call constructors), and I have omitted all the obvious accessor functions, but you can see that the code is pretty simple and quite doable on a job interview.
> Exactly how does a gay couple getting married affect you negatively?
Specifically, it will make me give them money. If your spouse dies in federal service, you are eligible for money. From taxes. Which I pay.
But that is really not the issue. The problem with gay marriage is that it grants a moral sanction to homosexuality. When made by an individual, I simply shrug, since his opinion has no value to me. When made by the government, which is supposed to be representing the interests and morals of all the people, it is reprehensible. It is the same reason why the government does not allow you to do drugs; so that if you do it, you do not have delusions that society approves of your actions. A drug user down the street does not hurt me in any way and I couldn't care less what he does in his living room. If he goes out driving and kills somebody, he goes to jail and loses his license; and that's fair. But I certainly don't want him to think that I approve of what he does. Just as I would not approve of him molesting children, having sex with cows, performing satanic rituals, torturing animals, or having sex with other men. If he wants to do it anyway, let him do it cowering and hiding, in shame and regret. Then maybe he will eventually stop. Until he does, I certainly want nothing to do with him. If I give a party, I will not invite him; if I have a job opening, I will not give it to him; if I run a store, I will sell him any goods. Of course, the government will forbid me from doing all that. And since they have the guns, I will comply. But nobody can make me like it, and that is the irony. The society still disapproves and the transgressors will know it.
>> If you keep advocating legalisation of gay
marriage,
>> God will most certainly want nothing to do with you.
>
> You talk to him do you?
I don't have to. He certainly has made his point throughout the Bible, and I can't think of a more convincing proof of God's opinion than destruction of Sodom and Gomorrah for homosexuality, among other equally disguisting things. Of course, being an atheist, I only read about it out of curiosity. My views on sodomy have nothing to do with God.
> if he does not want to see interracial marriage
> become an accepted practice, surely it is the
> right thing to do. Right?
Of course it is. Everyone is entitled to a vote in this country, and if he wants to vote against interracial marriage, it's his choice.
> People used to justify stomping on the civil rights of
> black people just like they currently are doing to gay people.
You forget that the rights you currently possess have been granted to you by society. You do not have a right to steal, kill people, practice poligamy, snort cocaine, spread child pornography, and other crimes. Your city may choose to grant you other rights, like specify what you can and can not do with your property. The city might require that you mow your front lawn on a regular basis, clean the sidewalk, keep your house paint looking good, provide access to wheelchaired people, etc. The "civil rights" you now take for granted were likewise granted by society's approval. Giving them to blacks was also granted in like manner, by recognizing them as human beings and thus equal to the white race before the law. But these rights do not just come out of thin are; they are a representation of the moral code of our society and are not unchanging. They can come and go and they will, for blacks, for whites, and for gays.
> Your kids are not going to turn gay just because they see gay people.
Nobody is afraid that their kids will turn gay by looking at other gays. However, they might not want their kids to associate with gays, just as they restrict their kids' social activities with other types of undesirable people.
> don't know about you, but it wouldn't have mattered
> how many gay men I saw kiss, boobies would still have given me a woody.
The issue is not whether you would turn gay, but whether you want to see gay men kiss in the first place. Just as we do not tolerate other disguisting habits some people have, so gay behaviour is definitely not socially approved.
> People are so afraid of turning gay - makes me wonder if they're *already* gay.
I don't know why I keep seeing this accusation. When I say I dislike drug users, nobody says "well, it's just because you're afraid of becoming one." Gosh darn it. That must be it. I'm so afraid I might start using drugs, I want to keep away from all those loony druggies.
> The constitution was writen so that people would
> not be abused by the government, ANY PEOPLE
So what about people who are disgusted by homosexuality? Why should they be forced by the government to acknowledge gay couples as "married"? Marriage is a an important concept in our society as well as in the legal system. A spouse gains many legal and social rights that are not available to even close friends. Why should anyone be forced to grant those rights to people in a relationship of such a perverted nature?
> I would certaintly say that homosexuals qualify as a minority group.
Then so do those who torture squirrels in their back yards, lust after eleven year old girls, or enjoy eating their own excrement. Because, you know what? That's what those people you call "homophobic" really think of gays. It's not a fear, it's revulsion. It's not hate, it's avoidance. And just as people avoid child molesters, drug users, sadists, and other perverts, they want to avoid gays. Hey, they might
even want the word "gay" back to its original meaning! Why should you let the government force gays and their ways upon the recoiling society?
> Just because its a democracy doesn't mean that
> the whatever the majority says goes.
Sorry. Democracy is majority rule by definition. If you want to be special, it is not a democracy any more. It's an aristocracy. Ours is an aristocracy of pull - if you can contribute enough money to your friends in Washington, you can do pretty much anything you want.
> And god help us if it ever does start working like that.
If you keep advocating legalisation of gay marriage, God will most certainly want nothing to do with you.
> that won because he openly advocated limiting
> civil rights of an etnic group, and used it to divide the country.
People want to have a president who shares their values in order to have the kind of a country they want to live in. In case you haven't noticed, people in 11 states decided that gay marriage was unacceptable. I suspect that majorities in other states hold similar views, and if the majority wants something, in a true democracy they would get it, since that is the definition of democracy.
> It sickens me to think that people who never voted
> before said "Whoa, nothing else has mattered to me in
> the last 20 years, but the QUEERS WANT TO GET MARRIED!
> Jarlene, find me my votin' hat!"
When you see the moral standards of your society being destroyed, what good man would not act, if the decline could be stopped, or at least slowed by simply showing up and voting NO? If he does not want homosexuality to become an accepted practice, surely it is the right thing to do.
> Despite the alleged "split" in the country.... There were no riots in the street.
Come on now, the election results only came out this morning. Who could possibly have had time to organize a riot? Give them a few days...
> I can look anywhere I like for counter examples,
:) I said that an array was always better than a linked list. I did not say that you should use arrays instead of all other data structures. Of course you would use classes for record data, trees for databases, and may be even *gasp* unions for some overlapping data. My point is simply that linked lists are an inferior construct and should always be replaced with something better, even if it is only a linked list of arrays for paging your element additions. As I said in another post, it makes me extremely uneasy to allocate memory for a single element, no matter what, so even when I use linked structures, they are paged linked structures like the B-tree. "Many at once" philosophy will always do better than "One at a time".
> because you said always. Thats what the arguments
> about, remember?
Do you remember?
> list.cpp, inserting elements into the middle of a
> linked list using the STL list template:
You are not doing the same thing. I did not intend to simulate insertion exactly in the middle. If my insertion point remained the same, I could simply preallocate a hole in the vector and copy the new elements into it. So no, it does not blow a hole in my theory.
> Given an array of 1000000 large elements, how
> would you insert an new element in the middle?
I have written a sample program to test insertion in the middle of an array (paged allocation and copy to shift) as compared to the same operation on a linked list. Arrays win by a factor of 40: here
real 0m0.308s
user 0m0.308s
sys 0m0.001s
Timings for linked list:
real 0m12.746s
user 0m12.685s
sys 0m0.006s
Array insertions are 40 TIMES FASTER! The vector class for this implementation uses paged allocation with realloc.
(On 2proc Athlon MP 1GHz, Debian Linux SMP)
With 100000 items:
Timings for array:
real 0m10.701s
user 0m10.691s
sys 0m0.004s
Timings for linked list:
real 6m41.403s
user 6m40.089s
sys 0m0.167s
Conclusion: NEVER USE LINKED LISTS!
> Given an array of 1000000 large elements
:)
Look who's making assumptions now!
> how would you insert an new element in the middle? > How would you remove an arbitrary element.
For large datasets you should not be using a simple array. The key to performance is narrowing your scope and keeping your elements in a linked list 1000000 links long is not a solution. It is an atrocity. When you have this much data, the rule is to break it up! Put it in a B-tree. Put it in a file and create an index so you would only load what you need. Pipe it through a sliding buffer. But whatever you do, do NOT put it in a linked list! Do NOT keep it all in memory! Not everyone has 4G of RAM. Not everyone likes to see memory hog programs that occupy all the RAM they do not need.
> No, you over generalised the solution, and have
> been continually attempting to weasel your way out
> of it by making specific assumptions that favour you.
The way I see it is: I came up with a good solution and have been continually attempting to figure out why you find it unacceptable. So far I have been unsuccessful, and received nothing but insults for my trouble. Thank you very much!
> Again, you're making completely unwarranted
> assumptions to steer the answer back to your
> unsupportable over-generalisation.
I can't help making assumptions when you have inadequately specified the problem. If you specify all your requirements I would be much better able to present a design to suit them. One suit does not fit all and different requirements may result in completely different designs.
> How do you know when and how I want the values I'm storing?
How, indeed, if you don't tell me!
> I may need them immediately in some other thread.
Then set up a sliding buffer between the threads. If you need to keep all your data and pipe it between threads, I would create a triple-buffer pipeline: one is being written to, second is being read by the "other thread", the third is being paged out to a file. This way I keep all the data, process it as it comes through, use a minimal amount of memory, which can be user-configurable to run on any machine (try that with your linked list!), can restart the computation if the computer crashes (or the user wants to turn it off for the night, like most normal people do), is limited by disk space (unlike the linked list, which is memory limited) which is always more plentiful than RAM. Another advantage is that because the data is written down as it is generated, the user can run another program to read it and visualize it, possibly adjusting the original computation without having to finish it.
> A sliding buffer is fine, as long as you don't mind losing the early data.
You don't have to lose it. If you need to keep it, you can page it off to disk. Use asynchronous IO and double buffer, like you would for graphics. Fill one buffer while the other one is being written, then swap. Set buffer size to match disk performance. Sure you might block now and then, but really, if you deal with humongous amounts of data and need to keep it ALL, there is just no way to avoid using the disk.
> If you have an array of objects, you will have
> to instantiate them one at the time anyway, so
> you will still have your memory fragmentation.
An array of object pointers is still better than a linked list. You get random access, only a single dereference overhead, easy sorting, no carriers, and a mostly negligible resize operation. You also don't need to have a next member in your objects, which is an important consideration when you want to have a templated container that works on all data types. Carriers (the thing with the object pointer and next pointer) waste memory, increase fragmentation, and are hard to iterate over.
> Huh, I just want to be sure: You're not using
> arrays with structures directly in it right?
Usually yes, I am.
> In this case, the cost of re-allocating the whole array is
> *much* higher than the cost of reallocating an array of pointers!
Quite true. That's why you want to avoid having to do it by allocating a page at a time. realloc helps a lot as long as you work with one array at a time, but usually I just design my algorithms to not do random inserts or deletes. This optimizes your data for reading, which is what most data is for anyway.
> Anyway, this kind of array is only possible if
> you use structures, it won't work for object
Yes it will, but you have to know what you are doing. If you use new [], then there is no problem, but if you use malloc, you need to call placement new to call the constructor of each object. If you use realloc, you have the additional restriction of bitwise copy capability, which means having no pointers to internal variables. All my containers are realloced, and I use a pointer array if I have something that doesn't fit.
> Besides, inserts and deletetions in an array are costly.
Right, but inserts and deletions are usually not a performance problem. Most data arrays are read a LOT more frequently than they are modified. In addition, it is almost always possible to combine your inserts and deletes into a single operation, thus minimizing copy. Profile before you optimize! You will find that reading is more critical than insert/delete.
> Right. Except because your code needs contiguous
> memory, it throws an exception much earlier than
> the linked lists
If I were actually running out of memory on a regular basis, I would first take a good hard look at the design and reduce memory consumption by any available methods, like sliding buffers or paging out old data to disk. In any case, an out-of-memory condition is a problem for your linked list design just the same except that you might be able to slide by a little longer. The problem is, what if some user has very little memory, like, say 32M? Unless you can actively limit memory consumption of your algorithm, the program will fail whether you use contiguous memory or not.
> That may be an acceptable trade off for better performance BUT IT MAY NOT.
How about a specific example?
> If I have to get a list of files in a directory,
> including all the subdirectories. I don't have to
> search through the results, I only have to go
> through it. To reduce the overhead, I could even
> use a single linked list instead of a double linked list.
Or you could completely eliminate the overhead by not storing the list at all, but by processing it as you read it. Take 'find', for instance: you can execute all your matches and commands on each entry without needing the entire list.
All right, so let's suppose you really do need the entire list. Will I be using a linked list to store the items then? Hell no! Paged allocation is the way to go here. Ok, so what if you need to modify other data while you are reading (like, say, if you are in a multithreaded environment where another thread is allocating memory)? Will I use a linked list then? Not exactly. I'll use a linked list of arrays to do paged allocation. Call it a habit, but my stomach turns at just thinking about allocating memory for a single item. It's just something I never do. Period. It's not a good way to deal with memory. Fragmentation, extra dereferencing, cache misses (two per element in a textbook implementation!), and total lack of access locality (not only is your element not in the cache, but you even lose the prefetch from processing the previous element), make linked lists totally unsuitable for any application. Call me a prude, but it is just something I wouldn't do to my computer.
> representing results for each year for an
> arbitrary number of years. So item 1 has existed
> since 1990, so need 14 years worth of data, item 2
> has existed since 2001 so only needs 3. Item 3 has
> existed since 1800 so needs 204 years.
I would use the following:Because you have more items than years, the vector overhead (24 bytes in for my implementation) would be a much smaller fraction of the data. The order of item data in the year depends on whether you care more about reading or about adding. For the former, do sorted insertion at the price of a copy, for the latter add them as they come at the price of a linear search instead of binary. The advantage over linked lists is continuity of data for each year; since you probably get yearly data for all items in one batch, adding them all simply consists of a sort and copy into one array. Because yearly blocks are likely to be large, they are good candidates for being in separate files, will compress well ('cause running zlib on a linked list is hard work
> You're attempting to solve any non deterministic
> decision problem, and want to save the state of
> some object/variable at each iteration. Now tell
> me, how much space am I going to need?
64K. You don't need to access the values you are storing until you are done, so you should write it to a file instead. Since file writes are expensive, buffer them through a small cache. Because you are probably CPU bound, you should allocate a buffer based on number of units you can process per bandwidth of your hard disk. Then turn on nonblocking IO and start your computation. At convenient stopping points call select to check if you can flush your buffer. If you succeed, clear the buffer and continue. This way you are maxing out both IO and CPU, are not bound by the size of the physical memory, are not a memory hog, and are crash-resistant because you conveniently checkpoint your job to storage. Then, when the job is finished, you can read however many gigabytes of data you just generated.
> How would you proceed if the corner case caused realloc() to return NULL?
In the same way you would proceed if you failed to add an element to your linked list: throw an exception. The exception should then be caught at the computation checkpoint level, the available data should be written to disk and the operation restarted.
> Sometimes you don't need searchability, just sequential access.
> And thats easy with linked lists, and doesn't have the overhead of building a tree.
It is also easy with sliding buffers, where you have no overhead at all. If you have data just pouring in and you can access it sequentially, you should break it up into manageable packets and queue them in a sliding buffer. This way it is really easy to split work between several CPUs. Also, in the common case you would be heavily using the same memory location over and over (good for cache), with never a reallocation and no copying either if you can grain it on packet size.
> Your arrays might do that,
:)
Yes they do. All of them.
> But, unless thats in your specs, if you assume
> your data is going to behave nicely
And what sort of data you have that can not be made to behave nicely? Why don't you tell me what it is, and I'll see if I can figure out how an array can be used for it. Challenges are fun
> all you'll get is a program that doesn't work in corner cases
> when it doesn't. Not smart. Not good programming.
Not quite right: the program will still work in all corner cases, it will just be slower sometimes. But corner cases are, by definition,
rare, and an occasional performance hit might not even be noticable.
> Losing the original data wasn't the problem. The
problem is,
> you've got no place to put the new data, which was the
> reason you did the realloc() in the first place.
This would be true for really large datasets that approach the size of your memory. The swap might save you for a while, but after that you really would have problems. In this case it really would be a bad thing to use an array. In fact, a linked list is a really bad solution here as well. What you need is a tree of some kind, so you can search it in a reasonable time. My machine has 512M of RAM, so by the time you hit this case you'll be working with something like 250M of data, and if it is in a linked list, it will take you hours to find element 31827481 and compare it with element 9481823. With data sets this big, which are frequently modified to boot, you are dealing with a database, not any regular array. So you'll have to write a database engine to get any kind of performance out of it. This is not, of course, what I had in mind; I was talking about mundane containers in the code for sundry data you may need to run it. Usually these are a few megabytes at the most, which is more than manageable.
> scientists say they have been trying to reduce
> weight and improve the performance of onboad instrumentation.
I guess they had two options: enroll all the scientists in a weight loss program, or hack the instrumentation to weigh less. Looks like they opted for the latter.
> there may be no copying involved as it will extend your allocated block if possible
In the general case there is no copying involved. Most of the time when you add elements, you add them to one container at a time, so there will (usually) not be any more blocks above the one you got and it will extend indefinitely until you run out of elements. Second: if you do get a copy, you will most likely get it only once, since after that your block will be at the end of the memory space. Third: arrays do not grow indefinitely. Some elements get added, some get removed, and eventually it settles down to some steady state size after which point no memory allocation is going to happen, and steady state performance is more important than startup performance.
> hit the limit of the contiguous space above your initial
> allocation and bang your program performance goes out of the window
If you use a linked list, your performance is not too high to begin with, from two dereferences and two cache misses for every element. As for the copy, a one-time event does not impact performance in any meaningful way; it is the common-case performance that matters.
> assuming you can get a sufficiently large contiguous
> block of memory, otherwise, realloc returns NULL
> and you're totally screwed.
First of all, you are not totally screwed, because you did not assign the return value of realloc directly to your pointer (you didn't, right?). realloc will not touch the original data if it fails reallocation. However, none of this matters on Linux, where realloc never returns NULL. With the "optimistic" memory strategy you get the pointer anyway and then you get killed by the OOM killer. Of course, new does the same thing, so your linked list is no improvement.
> But you're assuming that speed matters in this
> section. The code I'm using a linked list for is
> not used enough to make such optimisation
> neccesary. When speed matters, I use an array.
Ah, but code size matters too. When you don't need to make it fast, make it small (and if you need to make it fast, small is a pretty good idea too). Array code is smaller and easier to debug than linked list code. You also get the advantage of being able to look at the entire array in the debugger (priceless!). This is why I never use anything but arrays and design all my code specifically for arrays: sequential access whenever possible, no random modifications, inserts only at the end, combine any modifications into blocks, etc. This is guaranteed to get you excellent cache usage (or at least, as good as it is going to get) and I have yet to see an application where I couldn't fit such a design.
> I'm pretty sure the implementation I use
> allocates a new chunk of memory and copies.
Yes, that is correct, SGI STL implementation does just that. However, it also doubles the page size every time, so you get log2 allocations for your element adds, while a linked list gives you 2n allocations. It is possible to keep the old block, but you then have to make sure your objects can be bitwise copied (mostly means no pointers to internal variables) because there is no 'renew' and you get to use realloc. See http://sourceforge.net/projects/ustl for an STL implementation that does this.
> m_Size = n * sizeof(int);
:) (or at least figured out what the units of the member variables ought to be before writing the code)
Replace with
m_Size = n;
Really should have used the preview button
Well, not in two lines, but it's not too hard:It gets a little more complicated for object arrays (need to use placement new to call constructors), and I have omitted all the obvious accessor functions, but you can see that the code is pretty simple and quite doable on a job interview.