Well, as usual the story is more complex than that. The drive makers have recovered from the floods but tried to keep prices high artificially by holding back inventory. Now their inventory levels are eating them alive so I expect prices to ease soon.
They had a good run but at the end of the day they have to produce volume and if demand doesn't keep up prices have to go down to compensate. Pretty simple.
Well, kinda typical of the retail investor sentiment but you probably don't have to sue anyone. If it happened just as you said, and you didn't mess around too much afterwards other than try to sell what you had, then your broker will have a record of it. Contact your broker so you are in the queue (your broker is probably handling thousands of complaints already). Your broker will probably negotiate compensation with the NASDAQ on behalf of all of its customers, including you, so your best bet is to get the refund through your broker and not try to bypass them.
Just be sure you make a formal complaint. If you don't you could wind up missing out.
Suing... despite what you hear in the press that's always a last resort, and generally does not yield results for the actual abused clients that they had hoped for.
When the internet crash occurred circa ~2000 there were literally thousands of lawsuits, and hundreds made class-action status. For the next TEN YEARS I would, every so often, receive a letter from a lawyer that included a check for my portion of some class action settlement or other... all in all, I've received about 5-7 checks over the years. Most of them I just threw away... literally pennies on the dollar, not even worth cashing. Another couple I returned to the Judge with a note saying how worthless this was to the supposed victims and he could keep the money. One or two of them were large enough amounts to deposit and buy a meal at a nice restaurant with. And that's about it. Getting satisfaction from a lawsuit (as a passive litigant in my case, simply by being part of the class), is basically impossible.
That all said, I'm also not in a particularly charitable mood (as the other poster indicated). You clearly intended to simply day-trade the stock and hope for a quick bump, yah? You were gambling, not investing. Otherwise you wouldn't have minded holding on 'for the long term'. Nobody was holding a gun to your head. Though the NASDAQ screw-up was unprecedented, these are the kinds of risks you take when you get involved in IPOs.
AOL's problem was that they depended on users dialing directly into AOL's service... i.e. they went from dialup provider + portal to just portal and from there to irrelevance. Google and MySpace had little to do with it... the internet outgrew AOL's (and Yahoo's) model.
Let's not forget Netscape... same thing. Browser -> Portal -> irrelevance.
Don't invest in things you can't value. Pretty simple answer.
In any case, I think you might be thinking of Amazon, not Google. Google has been an incredible investment pretty much throughout. Its value was cut in half during the 2008 crash but it recovered almost completely to its pre-crash high in just a year and a half. A year and a half worth of 'swoon' isn't a big deal for a real investor.
Amazon, on the other hand, which also did well out of the box, rode the internet hype all the way to the internet crash and then didn't recover to its pre-internet-crash heights for nearly another 10 years (until the 2008 crash), then dived, and finally recovered for good in late 2009. Now, of course, Amazon is doing a lot better, but having to sit on an investment for almost 10 years post-internet-crash definitely counts as a bad bet.
Apple is in a similar situation now, with people prognosticating its health or death based on short-term (a few months) movement of its stock. The jury is still out. But today's world is a much different world than the internet boom and bust was.
That's easy. I'm a small investor and haven't been impeded by the big guys at all. They have provided opportunity after opportunity for buying and selling since the crash.
Most of the people on the boards I frequent (80% of whom are retired and well into their 80s, by the way) also haven't had any real trouble with the big guys. So if you are going to argue that you are stupider than a bunch of 80+ year old farts you are putting yourself into a corner. I'm a 45 year old not-quite-an-old-fart-yet and some of those guys do a better job than me (and I'm pretty good).
This is another one of those media-hyped stories that just isn't true. People are afraid of the markets, but they're afraid because the talking heads are telling them to be afraid, not because there is actually anything to fear.
To be fair, I would say that not everyone has what it takes to be an investor. There is volatility, it is possible to lose money... and the shorter-term view you have of the market the more money you can lose. And, unfortunately, most people (particularly younger people) have a very, very short-term view of the market. The vast majority of retail investors these days don't actually 'invest'. They (a) day-trade when they think they are investing and (b) don't have any real savings to invest with anyway. For that matter, people tend to not understand the vast, vast, VAST economic risks they take just having credit card debt.
I can give you an endless number of examples of this but perhaps the easiest to understand is to ask why people didn't invest during the crash once the Dow went below 9000. If you look at the graph in hindsight, even though the Dow dipped well below 7000 before eventually bottoming, the period of time it spent below 9000 was only around half a year.
During that period people basically stopped thinking, believed in the end-of-the-world stories, and lost out on one of the biggest bull runs in history. And you didn't even have to predict the bottom to do it... even investing at 9000 with the market still dropping another 30% before bottoming... those people made out like bandits simply by being patient.
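To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The 9000 entry and the further 30% drop come from the text above; the exit level of 12000 is purely a hypothetical round number picked for illustration, not a quoted market level, and dividends are ignored.

```python
def trough_level(entry, further_drop):
    """Index level after falling `further_drop` (a fraction) below the entry."""
    return entry * (1.0 - further_drop)


def total_return(entry, exit_level):
    """Fractional gain or loss from entry to exit, ignoring dividends."""
    return exit_level / entry - 1.0


entry = 9000.0                       # from the text: buying at Dow 9000
bottom = trough_level(entry, 0.30)   # "another 30%" lower before bottoming
paper_loss = total_return(entry, bottom)

hypothetical_exit = 12000.0          # assumption for illustration only
patient_gain = total_return(entry, hypothetical_exit)

print(f"trough {bottom:.0f}, worst paper loss {paper_loss:.0%}, "
      f"gain at hypothetical exit {patient_gain:.0%}")
```

The point of the sketch is only that the worst paper loss is temporary if you hold, while the eventual gain depends entirely on where you assume the index ends up.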
Nobody has patience these days, and that is a very bad fit for actually being able to become a good investor.
Generally speaking (and ignoring FB which I've already commented on)... but generally speaking this is NOT true. The small guy actually has the advantage in this market, which makes it ironic that the small guys have mostly abandoned it.
The big guys have been fighting amongst themselves since the crash and it has created lots of opportunities for smaller retail investors to find really excellent entry points. Simply put, the reduced liquidity in the market gives the advantage over to the smaller players whose trades don't move stocks while the bigger ones get stuck fighting each other.
It used to be that 'dumb money'... a euphemism for the 'retail investor', gave the markets enough liquidity to allow the bigger players to enter and exit positions without excessively moving stock prices. These days with the big boys playing against each other and reduced liquidity it's more a matter of one big boy outwitting another because their trades move the underlying stocks too much. The small guys can take advantage of the much more obviously oversold conditions to buy, and overbought conditions to sell. The big guys can't.
The problem that a lot of retail investors have is that they don't actually know how to invest... they think they are investing when they are actually just day-trading. They pile into dangerous spaces that have already built up momentum to the upside instead of buying when they were low. For example, smaller players are STILL piling into the muni/govt bond markets even as we speak despite the huge risks involved as the Fed QE2 ends. Most retail investors sell during the inevitable pullbacks in these spaces (instead of selling during the rise), or buy well after a security has risen (instead of when it was closer to the bottom and still falling). They believe the crap that is fed to them by the media, believe the hype, believe the stories written by 13 year olds or guys with fancy titles and obvious conflicts of interest, and don't bother reading the financials of the companies they invest in or even listen in on the conference calls.
It doesn't take all that much work to actually invest properly, it just takes a bit of patience and a minimum of a medium term view (instead of a short-term reactionary view). The best investors in this market aren't the idiots who day-trade, it's the people who might do one or two small trades a week, maximum, slowly working long-term positions and collecting dividends while the big boys rattle the market back and forth and provide the great entry and exit points.
The deck just isn't stacked against us, people only believe it is.
Correction, the updated guidance (that the analyst based his revised opinion on) occurred several weeks ago... so investors have even LESS of a reason to complain. This wasn't news. And, again, just because an analyst reduces their opinion based on reduced guidance doesn't change the fact that it was still just an opinion, one based on information that had already been widely disseminated weeks earlier.
So... I don't particularly like MS and I don't particularly think they did a good job, but good luck trying to blame them for this mess. People have only themselves to blame if they bought into this IPO.
Actually, MS came out with a statement indicating that all IPO members (both retail and institutional investors) received updated guidance during the roadshow via a revision to the S1, and that the pricing of the IPO included that guidance. The analyst opinion was simply reflective of the revised guidance.
You'd have to be pretty stupid to assume that analysts wouldn't revise their opinions based on the change in guidance.
Well, you'd have to be pretty stupid to participate in the IPO in the first place, let alone invest in the stock. The thing was overpriced, the talking heads said it was overpriced, a simple high school math calculation would tell you it was overpriced, most people KNEW it was overpriced... and bought it anyway hoping for another 'sure bet' circa the internet frenzy leading up to the internet crash circa ~2000.
In some respects this is a good thing, it brings a much needed dose of reality to fuzzy-brained armchair investors.
If you want to complain about something you can complain about the NASDAQ screwing up the opening and not providing trade confirmations for 3+ hours to investors whose money was locked up and who could only watch the price start to drop without knowing whether they even owned shares, or being able to sell.
It isn't that easy. Small buffers actually result in a high level of instability that can leave them completely empty as often as it leaves them completely full. If the buffer is too small performance goes completely to hell on a permanent basis. It becomes highly chaotic.
There are plenty of situations where you want a bigger buffer to handle a very short-term burst of information, but where that same big buffer becomes a liability when the information is continuously in excess of available bandwidth.
Similarly there are plenty of situations where a small buffer gives you very nice low latencies... but only for the packets that manage to make it through the router. That same small buffer will also drop way too many packets in numerous situations. Worse, the buffer is too small for other protocols at the edges to be able to react to changes in the situation. No protocol can react instantly but a small buffer pretty much requires all packet sources to react instantly. It just doesn't work.
So big buffers have latency issues and small buffers have stability issues.
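A toy model makes the tradeoff concrete. This is only a sketch under simple assumptions (a tail-drop FIFO that forwards one packet per tick, with made-up arrival patterns), not a model of any real router. The maximum queue depth stands in for worst-case queuing latency, in ticks.

```python
def simulate(arrivals, capacity, service_per_tick=1):
    """Tail-drop FIFO. Returns (packets_dropped, max_queue_depth)."""
    depth = 0
    dropped = 0
    max_depth = 0
    for n in arrivals:
        accepted = min(n, capacity - depth)    # whatever does not fit is dropped
        dropped += n - accepted
        depth += accepted
        max_depth = max(max_depth, depth)
        depth -= min(depth, service_per_tick)  # forward packets for this tick
    return dropped, max_depth


burst = [10] + [0] * 20   # one short burst, then silence
overload = [2] * 100      # input permanently at 2x the output rate

for name, pattern in (("burst", burst), ("overload", overload)):
    for capacity in (4, 64):
        drops, depth = simulate(pattern, capacity)
        print(f"{name:8s} buffer={capacity:2d} dropped={drops:2d} max_depth={depth}")
```

In this model the small buffer throws away most of the short burst, while the big buffer absorbs it. Under sustained overload the big buffer still drops packets once it fills, and every packet that does get through waits behind a full queue.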
In any case, neither of you two are even remotely close to being right. Having large buffers and 'solving' the latency issue by reordering packets is not an implementation, it's a desire. Anyone who has dealt with network backlogs knows that no matter how big your buffer is incoming packets can always fill it to the brim when output is unable to keep up with input. So at some point, no matter what, you have to drop packets. Simply dropping based on a time deadline is a horrible solution... that's even worse than RED. Assuming that you never have to drop anything is also a horrible solution because that implies fair-queuing at the center of the network which, due to the number of simultaneous connections running through the center of the network, requires effectively infinite buffer space... a nice concept, but not realistic.
So, huge buffers with fancy latency-based algorithms don't fix the problem. And small buffers don't fix the problem either.
It's a part of the solution but not a complete solution. Boundary problems are different from center-of-network problems. Playing with TCP window sizes only works well at the edges and only in the outgoing direction (the 'inflight' sysctls that we've had forever), but does not completely solve the problem because you always need to add one or two additional packets above and beyond the calculated sweet spot to absorb changes in latency from other links and give the algorithm time to respond whenever reality changes.
This solution in both incoming and outgoing directions suffers from a connection multiplication problem. That is, it works fine if you have only a few simultaneous TCP connections running but it breaks down when you have dozens or hundreds due to the need to have 1-2 extra packets of slop in the reported window. It degrades gracefully in the outgoing direction but blows up very quickly when you are trying to control bandwidth in the incoming direction by changing the window size you report available in the outgoing ACKs (anyone running torrents can tell you this but the problem occurs for any busy network).
However, bandwidth limiting in this fashion DOES reduce packet backlogs and queues significantly. Not enough (it's simply impossible to make the algorithm stable at the sweet spot so you always need 1-2 additional packets), but significantly. Hence it is part of the solution.
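The connection multiplication problem is easy to see with some back-of-the-envelope arithmetic. The sketch below assumes 1460-byte packets and a 250,000 byte/sec link (both numbers are assumptions for illustration) and takes the 2 packets of per-connection slop from the description above.

```python
def slop_bytes(connections, slop_packets=2, packet_bytes=1460):
    """Extra bytes sitting in the queue beyond the calculated sweet spot."""
    return connections * slop_packets * packet_bytes


def added_delay_seconds(connections, link_bytes_per_sec,
                        slop_packets=2, packet_bytes=1460):
    """Queuing delay contributed by the slop alone at the given link rate."""
    return slop_bytes(connections, slop_packets, packet_bytes) / link_bytes_per_sec


LINK_RATE = 250_000  # bytes/sec, assumed for illustration

for n in (4, 40, 400):
    delay_ms = added_delay_seconds(n, LINK_RATE) * 1000.0
    print(f"{n:3d} connections -> {delay_ms:7.1f} ms of extra queue")
```

The slop per connection is constant, so the extra queue grows linearly with the number of connections. That is harmless with a handful of streams and painful with hundreds.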
Step 2: On speed boundary changes you have to run fair queuing, period. Not only that but you have to do it on both sides of the boundary (i.e. in both directions each at the choke point). In my case I could never get truly reliable operation by only running fair queuing in the outgoing direction. I had to actually run the fair queue on *both* ends of the link, meaning I had to colocate a server to serve as the terminus for a VPN and run all the traffic over the VPN so I could control both ends. The fair queue also has to reserve bandwidth for pure TCP ACKs to prevent restricting the bandwidth in the opposite direction due to ACK starvation.
Fair queuing takes care of all remaining packet buffering issues at the edges of the network.
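For readers who haven't seen one, the core idea of a fair queue can be sketched in a few lines of Python. This is a bare per-flow round-robin, an illustrative simplification and not how PF or any particular kernel implements it: it ignores packet sizes, bandwidth limits, and the reserved class for pure ACKs mentioned above. The class and flow names are made up for the example.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Per-flow round-robin: each active flow sends one packet per turn."""

    def __init__(self):
        self.flows = OrderedDict()   # flow id -> deque of queued packets

    def enqueue(self, flow, packet):
        self.flows.setdefault(flow, deque()).append(packet)

    def dequeue(self):
        if not self.flows:
            return None
        flow, queue = next(iter(self.flows.items()))  # flow at the head of the ring
        packet = queue.popleft()
        del self.flows[flow]
        if queue:
            self.flows[flow] = queue  # still has packets: move to the back
        return packet


fq = FairQueue()
for pkt in ("bulk-1", "bulk-2", "bulk-3"):
    fq.enqueue("bulk-transfer", pkt)
fq.enqueue("ssh", "keystroke")

print([fq.dequeue() for _ in range(4)])
```

The interactive packet goes out second instead of waiting behind the whole bulk backlog, which is the property that keeps latency sane at a choke point.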
Center-of-network issues can't use the above solutions simply because there are too many connections flowing through the center of the network to track and not enough packet buffer space to sufficiently buffer all those connections for fairq operation (N x B is just too big). It's impossible to calculate where the choke points are based on connection tracking at the center-of-network. Easy at the edges, impossible in the center.
So the center-of-network has to provide additional congestion control through some sort of AQM, and if early-warning requests go unheeded it must start dropping packets. Personally speaking I hate the idea of having to drop packets, it leads to all sorts of problems everywhere, but the simple fact of the matter is that there is not enough per-connection buffer space at the center of the network to run fairq, nor is it an appropriate place for that. Without tracking you cannot mess with tcp window sizes in the center-of-network and even if you could track you can't bundle the connections based on where the choke point is at the moment, there is simply not enough information.
So we are talking at least three and probably closer to half a dozen different mechanisms being needed to reduce network latencies and still provide good and fair performance.
I run a VPN to a colo so I have full control over the packet stream in both directions. I use PF on both sides with fair-queue (DragonFly of course), service separation, a separate channel for pure acks, etc. Works great, actually. I gang the VPN across both COMCAST and U-VERSE and tend to run full-out in both directions for long periods of time (I can even run it over 3G but my Android phone crashes pretty quickly when I do that.. oh well).
My cable bill has been about the same for the last 5 years, the only difference now is that the 'internet' portion of the bill is larger and the channel portion (I'm now down to basic cable) is smaller. But COMCAST is still getting their hunk of flesh out of me.
The COMCAST link has been the most reliable, and their bandwidth controls are fairly predictable. Even with the highest-speed internet plan they offer I can only count on around 2 MBytes/sec downlink on a continuous basis, and around 250 KBytes/sec uplink. The only real issue I've had w/COMCAST is that their cable modem insists on assigning a non-routable IP when the physical link is down, but that was easily solved with a 'reject 192.168.100.1;' line in /etc/dhclient.conf.
I also have AT&T U-Verse... my advice: stick with COMCAST or, if you absolutely have to use U-Verse, don't bother with the static IPs. Basically AT&T doesn't know what they are doing when it comes to providing internet access. Their U-Verse crap uses MAC based filtering and can't handle multiple IPs behind a router, and it loses its mind every once in a while. The only semi-reliable way to connect to it is via a hard port on the WIFI router they supply through NAT. Basically you have to use their NAT service and WIFI router (though you can use a hard port on the router) to get anything even remotely reliable, and you can't do any fancy MAC filters because their WIFI router forgets about them every so often and breaks you. Fortunately the VPN runs over the NAT service just fine.
Sustained downlink and uplink rates with U-Verse are about 50% what I can get with COMCAST, only about 1 MByte/sec downlink and only around 150 KBytes/sec uplink with their highest rated service. If I go any higher I hit long periods of time where their link can't sustain the bw and packets start to build up on their routers instead of mine (where I can't control them), killing ping times.
I had long conversations with two sets of AT&T techs. The second guy knew what he was doing and cleaned up the copper to absolute perfection, and I can check the stats on the short-haul DSLx2 lines (that just go to the corner of the street), so these bw and reliability issues are not related to my twisted pairs.
They (AT&T) are lying if they say you get more than that in any sort of sustained manner. It's limited by their upstream... the physical link can handle more, but even though I don't use the U-Verse TV service their hardware still reserves uplink bw for it, which is annoying as hell, and their uplink and backbone are clearly under-provisioned.
Still, at least AT&T is creating some competitive pressure on COMCAST. Not a whole lot, but some. I wish Verizon had fiber in my area.
I've noticed a few other issues with U-Verse. AT&T's backbone hits log-jams every so often, drops a lot more packets than COMCAST just generally, and appears to choose routes to various places that go through problem-prone backbone infrastructure. It's quite annoying.
I had a DSL line for a while too, but uplink speeds from my location are a joke and the pricing is no longer competitive for the meager amount of bw I can get. It was reliable, but couldn't deliver the bw.
--
Insofar as streaming T.V. goes, it works pretty well here. NetFlix, Hulu, other apps. Strangely enough it works better through my VPN than it does directly, probably because neither COMCAST nor AT&T can do content-aware filtering of an encrypted VPN's UDP stream (and ganging aside). I mostly use Apple products &
There was a list of keywords the CIA was known to filter on, so we'd often just insert them randomly into postings so they'd get read by some poor overworked CIA analyst.
Yah, I get those two mixed up all the time, and will continue to probably for the rest of my life. On the bright side people know it's actually me doing the posting when they read that and a few other grammatical mistakes that I often make.
Once in the late 1990's we had a weird bug where FTPing or RCPing a particular file between two offices would often result in a corrupt file on the other end. We kept scratching our heads trying to figure out what could possibly be corrupting the file. FTPing it anywhere else succeeded... no corruption. Everything else between the offices seemed to work ok.
It wound up being a hardware issue with the T3 between the two offices. The hardware would corrupt the bitstream in a manner that tended to PASS the TCP/IP checksum, resulting in corrupted data. It required a particular pattern of 1's and 0's for the bitstream to be corrupted in a manner that passed the checksum, which this particular file happened to have.
These days, of course, I use scp to transfer files whenever possible. SSH will detect that sort of corruption and fail with a protocol error. Encryption has certain uses beyond just encrypting the data, it seems!
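To illustrate why that kind of corruption can slip through, here is a small Python sketch of the 16-bit ones' complement Internet checksum that TCP uses. Because it is just a sum of 16-bit words, some changes cancel out; swapping two words is the simplest such case. That is only an example of the general weakness, not a claim about what the T3 hardware actually did to the bitstream. The SHA-256 digest stands in here for the kind of cryptographic integrity check SSH applies; the sample payload is made up.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum of 16-bit big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                             # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd\x00\x01"
corrupted = b"\xab\xcd\x12\x34\x00\x01"            # first two 16-bit words swapped

print(original != corrupted)                                            # data differs
print(internet_checksum(original) == internet_checksum(corrupted))      # checksum agrees
print(hashlib.sha256(original).digest() == hashlib.sha256(corrupted).digest())
```

Addition is commutative, so the checksum cannot tell the two payloads apart, while a cryptographic digest does.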
Intel has had quite a few serious chip bugs too, all in errata. A number of new cpu bugs always appear in new generations of both AMD and Intel chips, but both companies have very large test suites and the number of new bugs goes down in every generation.
Don't forget that Intel had to recall a sandybridge chipset early in the sandybridge cycle, which cost them something like a billion dollars because the related motherboards had to be thrown away and replaced. That was due to internal on-chip circuitry related to a SATA port burning out.
Right at this moment AMD has two issues facing it in order to compete on workstations: (1) Power and (2) Performance. Their initial bulldozer release clearly depends too much on compiler optimizations to make full use of the architecture. They will clearly have to bulk-up some of the simplifications they made that made their cpu cores a little too sensitive to instruction sequences generated by compilers and I hope their next few releases will do better.
On power consumption it comes down to the Fab as much as anything else. Their dependence on the Fab is clearly a problem and they've made a break for it to try to solve it, even though it is costing them dearly. At the same time Intel has made some major advances in their three fabs, to the point where Intel can do their entire production on just two of those three fabs now but they decided to keep the third fab because they think they can 'grow into' it.
So AMD definitely has some work ahead of it, and I am hoping they reserve some of their focus for the high-end and don't concentrate entirely on laptops. I always like to say that I love AMD, but in the stock market I invest in Intel. That's just business. But I got on the AMD bandwagon big-time when they got to 64-bit first and I stuck with them all the way through the Phenom II.
Now, at this moment, Intel's SandyBridge has the best value and AMDs bulldozer is quite far behind, so new purchases for me right now are Intel. That may change in the next year or two and when it does my new purchases will happily be in the AMD camp again. Frankly, AMD only has to get within shouting distance (~8%) of Intel and I will happily use AMD. AMD doesn't have to beat Intel.
I think there are a number of things AMD can do right now to compete better with Intel. One of the biggest is in the mini-server department (albeit clearly with lower volumes than their current focus on laptops & integrated graphics). AMD consumer cpus (aka Phenom II) always had ECC support but very few motherboards actually supported it, which made it difficult to use AMD for mini-servers and avoid the Intel Xeon tax to get ECC. If AMD worked on the mobo vendors to ALWAYS support an ECC option that would allow them to compete against Intel Xeons on price, even if they are unable to compete on performance.
On the opterons AMD clearly has the right idea going with high-core-count cpus, but the memory subsystem is lagging too much to really be able to make use of all those cores. That seems to be low-hanging fruit to me, something which should be readily addressable by AMD. The opterons still have a lot of value and potentially can have a radical improvement in value with Bulldozer, but only if AMD can push the core count and improve the memory subsystem.
On large multi-core boxes AMD also needs to improve CMPXCHG and other atomic instructions in situations where contention is high. Right now multi-chip opteron systems seriously lag Intel on contended latency due to cache coherency inefficiencies. Will Bulldozer fix those latency issues? I don't know.
AMD only needs to get within shouting distance of Intel for me to buy their chips, and work their mobo producers a bit more to get better overall support for their chip's capabilities. They don't have to beat Intel.
The pushes and pops involved are for call-saved registers, not for arguments. Over the years GCC has kinda flip-flopped over the best way to handle that... whether to use PUSH and POP or to use SUB/MOV/MOV/MOV/... the MOV sequences produce much longer instructions, so if you are space-conscious (e.g. -Os), you are more likely to get PUSH/POP.
Intel and AMD cpus, over the years, have been better or worse at optimizing instructions which adjust the stack pointer. These days PUSH/POP sequences should be as fast as MOV sequences... maybe slightly slower fully cached but they'd get it back with reduced L1 instruction cache misses. I haven't done any exhaustive testing, however. Modern cpus can have so many instructions in-flight at once that simple non-dependent sequences such as PUSH/PUSH/PUSH or MOV/MOV/MOV are generally going to not bottleneck anything.
What's really amusing is that I've been on the scene for so long if you google my name 'Matthew Dillon', the first entry is actually... me! And not the actor(s). I'm sure that grinds a bit but I do bask in the occasional fan mail reaching my inbox, just before I hit the 'delete' key.
In recent years it's started to flip back and forth, and I expect Hollywood will again take over the top spot after things die down again :-)
Since the cat is out of the bag some further clarification is required so I will include some more of the email I received. I didn't quite mean for it to explode onto the scene this quickly, but oh well.
Again, note that this is *NOT* an issue with Bulldozer. And they will have a MSR workaround for earlier models.
>> quote "AMD has taken your example and also analyzed the segmentation fault and the fill_sons_in_loop code. We confirm that you have found an erratum with some AMD processor families. The specific compiled version of the fill_sons_in_loop code, through a very specific sequence of consecutive back-to-back pops and (near) return instructions, can create a condition where the processor incorrectly updates the stack pointer.
AMD will be updating the Revision Guide for AMD Family 10h Processors and the Revision Guide for AMD Family 12h Processors, to document this erratum. In this documentation update, which will be available on amd.com later this month, the erratum number for this issue will be #721. The revision guide will also note a workaround that can be programmed in a model-specific register (MSR)."
end quote
They go on to document a specific workaround when the MSR is not programmed, which is basically to add a nop for every five pop+return instructions (though I'm not sure if the nop must occur between sequences or within the sequence). I will note that just the presence of 5xPOP + RET does not trigger the bug alone, it requires a very specific set of circumstances set up prior to that (that gcc's fill_sons_in_loop() procedure was able to trigger when gcc 4.7.x was compiled -O, when compiling particular .c files).
As I said, this bug was very difficult to reproduce. It took a year to isolate it and find a test case that would reproduce it in a few seconds. Until then it was taking me upwards of 2 days to reproduce it on a 48-core and much longer to reproduce it on a 4-core.
Since the bug is sensitive to the stack pointer address, the initial stack randomization that DragonFly does multiplied the time it took to reproduce the bug. But without the stack randomization the bug would NOT reproduce at all (I would never have observed it in the first place). In other words, the bug was *very* stack address sensitive on top of everything else.
I was ultimately able to improve the time it took to reproduce the bug by poring over all my previous buildworld runs and finding the .c files that gcc had compiled that were most statistically likely for gcc to seg-fault in. Then once I isolated the files I iterated all possible starting stack offsets and eventually managed to reproduce the bug within 10 seconds using a gcc loop (10-20 gcc runs on the same file).
Changing the stack offset by a mere 16 bytes and the bug went away completely. The one or two particular stack offsets that reproduced the bug could then be further offset in multiples of 32K and still reproduce the bug at the same rate. Using a later version of gcc and the bug disappeared. Compiling with virtually any other options (turning on and off optimizations)... the bug disappeared.
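The general shape of that kind of sweep can be sketched in Python. This is not the harness that was actually used. It assumes one common trick for shifting a process's initial stack pointer on Unix, which is padding the environment with a variable-length string. The variable name STACK_PAD, the function names, and the file name in the usage comment are all hypothetical, and with stack randomization enabled the padding only adds to whatever offset the kernel already picked.

```python
import os
import subprocess


def count_failures(cmd, runs, pad_bytes):
    """Run `cmd` `runs` times with a padded environment; count non-zero exits."""
    # Assumption: environment strings live at the top of the initial stack,
    # so a longer value moves the starting stack pointer.
    env = dict(os.environ, STACK_PAD="x" * pad_bytes)
    failures = 0
    for _ in range(runs):
        proc = subprocess.run(cmd, env=env,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        if proc.returncode != 0:
            failures += 1
    return failures


def sweep(cmd, runs=20, step=16, limit=4096):
    """Map each padding size to the number of failed runs at that offset."""
    return {pad: count_failures(cmd, runs, pad)
            for pad in range(0, limit, step)}


# Hypothetical usage (file name is a placeholder):
#   hits = sweep(["gcc", "-O", "-c", "some_file.c"])
#   print({pad: n for pad, n in hits.items() if n})
```

Counting any non-zero exit is a deliberately crude failure test; a real harness would want to distinguish a compiler crash from an ordinary compile error.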
On the bright side, I thought this was a bug in DragonFly for most of last year and set about 'fixing' it, and wound up refactoring most of DragonFly's VM system to get rid of SMP bottlenecks and making it perform much better on SMP in the face of a high VM fault rate. So even though we wound up not doing the 2.12 release the eventual 3.0 release (that we just put out recently) has greatly improved cpu-bound performance on SMP systems.
AMD has indicated to me that the Bulldozer is not affected, which is a relief.
I guess I should have realized this would get slashdotted. In any case, it took quite a bit of effort to track the bug down. It was very difficult to reproduce reliably. It isn't a show stopper in that it really takes a lot of work to get it to happen and most people will never see it, but it's certainly a significant bug owing to the fact that it can be reproduced with normal instruction sequences.
I began to suspect it might be a cpu bug last year and after exhaustive testing I posted my suspicions in December:
Older versions of GCC were more prone to generate the sequence of POP's + RET, coupled with a deep recursion and other stack state, that could result in the bug. It just so happened that DragonFly's buildworld hit the right combination inside gcc, and even then the bug only occurred sometimes and only on a small subset of .c files being compiled (like maybe 2-3 files). The bug never manifested anywhere else, doing anything else, running any other application. Ever.
In particular the bug disappeared with later versions of GCC and disappeared when I messed with the optimizations. We use -O by default, not -O2. The bug disappeared when I produced code with gcc -O2 (using 4.4.7).
It is really unlikely that Linux is affected... the sensitivity to particular code sequences laid out in the compiler is so fine that adding a single instruction virtually anywhere could make the bug disappear. Even just shifting the stack pointer a little bit would make it disappear.
In any case, for a programmer like me being able to find an honest-to-god cpu bug in a modern cpu is very cool :-)
Every open-source filesystem to-date has had serious pitfalls. Very serious pitfalls. In the Linux space it comes down to either significant bugs under heavy loads or extremely poor performance. I don't use Linux in production myself but I have several friends that do and they have yet to find any solution that doesn't occasionally explode in their faces. People talk about a lot of these linux filesystems as if they were the best thing since sliced bread but that's really only on paper. Every linux filesystem to-date has had and still has serious issues... everything from pseudo-commercialization or licensing to serious bugs when pushed... it's a mess.
In the BSD space there is basically no viable choice other than HAMMER1 (DragonFly) or ZFS (FreeBSD). And, no, I don't consider UFS w/softupdates and logging (let alone 'background fsck' or its very limited snap features) to be a viable choice.
HAMMER1 and ZFS also have serious deficiencies. For HAMMER1 it's excessive seeking to access meta-data. For ZFS it's excessive kernel memory use and the need for a lot of tuning to match the workload (and good luck with mixed workloads). With UFS you begin to hit major issues the instant kern.maxvnodes is hit, or the moment the directory hash cache limit is reached.
For DragonFly users, HAMMER1's meta-data issue is fairly easily solved. One big lesson we learned was that it doesn't actually take a whole lot of cache to cache the meta-data for even a modestly large filesystem (~several terabytes), so DragonFly's generic swapcache feature coupled with a small SSD solves the meta-data problem very neatly. DragonFly also doesn't have a maxvnodes issue for caching purposes with the HAMMER+SSD combination and it solves it WITHOUT having to integrate the SSD into the filesystem like ZFS does. I learned a number of other lessons from HAMMER1 as well, particularly when it came down to the level of sophistication required to manage HAMMER1's B-Tree and the vulnerability (for any filesystem) of depending too much on the free block map.
However, even with all the features HAMMER1 has (automatic fine-grained history, trivial snapshots, trivial streaming incremental backups, etc)... it couldn't get us to our goal.
HAMMER2 is going to give us numerous additional features while at the same time solving the limitations of HAMMER1 that prevented it from being easily extended to cluster setups. HAMMER2 will have all the features of HAMMER1 plus also writable snapshots, multi-branching snapshots, a copies mechanism that ought to work considerably better than ZFS's, larger checks (up to 192 bits), block compression, a better de-dup implementation, and numerous other features. Plus it will be better matched for the clustering features we want. And, on top of all of that, HAMMER2's code base is actually going to be less complex than HAMMER1's code base was.
The biggest lesson learned from the HAMMER1 work is that meta-data is easy to cache, even for super-huge filesystems. We are taking advantage of that realization to greatly simplify the allocation scheme to make snapshot management & features utterly trivial to implement. Most free space management will be disconnected from production access paths (reading AND writing).
For our production systems it depends 100% on the actual amount of duplicated data, since bulk data reads are needed to verify the duplication. The number of passes is almost irrelevant because they primarily scan meta-data N times, not bulk data (duplicated bulk data only has to be verified once).
The meta-data can be scanned much more quickly than the verification of duplicated bulk data because the meta-data is laid out on the physical disk fairly optimally for the B-Tree scan the de-dup code issues. So meta-data can be read from the hard disk at 40 MBytes/sec even without the use of a SSD to cache it. Of course, with DFly's swapcache and the meta-data cached on the SSD that scan runs at 200-300 MBytes/sec.
But in contrast, the bulk reads used to validate the duplicate data just aren't going to be laid out linearly on the disk. There's a lot of skipping around... so the more actual duplicate data we have the larger the percentage of the disk's surface we have to read to verify it.
This is an area which I could further optimize in HAMMER's dedup code. Currently I do not sort the bulk data block numbers when running the data verification pass. Not only that but I am scanning a sorted CRC list, so the bulk data offsets are going to be seriously unsorted. Sorting them would definitely improve performance, probably quite a bit, but it would still not be anywhere near the 40 MBytes/sec the meta-data scan can achieve off the platter. It would not be a whole lot of programming, probably a day to do that. It currently isn't at the top of my list though.
What this means, in summary (and even with semi-sorting of the bulk data blocks), is that one can use a bounded amount of ram without really affecting the efficiency of the off-line de-duplication.
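To make the sorting point concrete, here is a rough Python sketch. It is not HAMMER's code: the (crc, offset) records and both helper functions are made up for illustration, and seek cost is crudely approximated as the total distance between consecutive read offsets.

```python
from itertools import groupby

def duplicate_candidates(records):
    """records: iterable of (crc, offset) pairs taken from meta-data.

    Returns the offsets whose CRC occurs more than once, in the order a
    scan of a CRC-sorted list would visit them.
    """
    out = []
    for _, group in groupby(sorted(records), key=lambda r: r[0]):
        group = list(group)
        if len(group) > 1:  # only apparent duplicates need a bulk read
            out.extend(offset for _, offset in group)
    return out

def seek_distance(offsets):
    """Crude seek-cost model: sum of jumps between consecutive reads."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))

# Hypothetical meta-data: two CRCs are duplicated, one is unique.
records = [(0x1111, 900), (0x1111, 10), (0x2222, 500), (0x2222, 20), (0x3333, 700)]

crc_order = duplicate_candidates(records)  # offsets jump around the disk
offset_order = sorted(crc_order)           # same reads, issued in disk order
```

Reading in offset order covers the same blocks with far less head movement in this toy model. A real implementation would still have to pair each block back up with its CRC group for the comparison, and would probably sort in bounded batches to keep memory use fixed.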
Well, I can tell you why the option is there... it's not because of collisions, it's there to handle the case where there is a huge amount of actual duplication where the blocks would verify as perfect matches. In this case the de-duplication pass winds up having to read a lot of bulk-data to validate that the matches are, in fact, perfect, which can take a lot of time versus only having to read the meta-data.
Just on principle I think it's a bad idea to just trust a checksum, cryptographic hash, CRC, or whatever. Corruption is always an issue... even if the filesystem code itself is perfect and even if the disk subsystem is perfect there is so much code running in a single address space (i.e. the KERNEL itself) that it is possible to corrupt a filesystem just from hitting unrelated bugs in the kernel.
Not to mention radiation flipping a bit somewhere in the cpu or memory (even for ECC memory it is possible to get corruption, but the more likely case is in the billions of transistors making up a modern cpu, even with parity on the L1/L2/L3 caches).
Hell, I don't even trust IP's stupid simple 1's complement checksum in HAMMER's mirroring protocols. Once during my BEST Internet days we had a T3 which bugged out certain bit patterns in a way that actually got past the IP checksum... we only tracked it down because SSH caught it in its stream and screamed bloody murder.
If you de-duplicate trusting the meta-data hash, even a big one, what you can end up doing is turning 9 good and 1 corrupted copies of a file into 10 de-duped corrupted copies of the file.
I'm sure there are many data stores that just won't care if that happens every once in a while. Google's crawlers probably wouldn't care at all, so there is definitely a use for unverified checks like this. I don't plan on using a cryptographic hash as large as the one ZFS uses any time soon, but being able to optimally de-dup with 99.9999999999% accuracy is a reasonable argument for having one that big.
For on-line de-duplication the most optimal case in my view is to only de-dup data which may already be present in the buffer cache from prior recent operations, so the on-line dedup only maintains a small in-kernel-memory table of recent CRCs. This catches common operations such as file and directory tree copying fairly nicely.
The off-line dedup catches everything using a fixed amount of memory and multiple passes (if necessary) on the meta-data, then bulk data reads only for those blocks which appear to be duplicates to verify that they are exact copies.
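A minimal Python sketch of that off-line scheme, under loud assumptions: the filesystem is modelled as a dict of offset to block bytes, memory is bounded by handling only one slice of the checksum space per pass, and offline_dedup and its parameters are hypothetical names rather than HAMMER's interfaces. In a real filesystem the checksum would come from meta-data instead of being recomputed.

```python
import zlib

def offline_dedup(blocks, passes=4, checksum=zlib.crc32):
    """blocks: dict mapping block offset -> bytes.

    Returns a dict mapping each duplicate offset to the offset of the
    verified-identical block it can share storage with.
    """
    remap = {}
    for p in range(passes):
        table = {}  # checksum -> offsets of distinct blocks (this pass's slice only)
        for offset in sorted(blocks):
            crc = checksum(blocks[offset])  # stands in for the meta-data scan
            if crc % passes != p:
                continue  # belongs to another pass; keeps the table small
            for first in table.setdefault(crc, []):
                # The checksum only nominates a candidate. The bulk data
                # is compared byte for byte before anything is merged.
                if blocks[first] == blocks[offset]:
                    remap[offset] = first
                    break
            else:
                table[crc].append(offset)
    return remap

blocks = {0: b"aaaa", 4096: b"bbbb", 8192: b"aaaa", 12288: b"cccc", 16384: b"bbbb"}
result = offline_dedup(blocks)
```

Passing a deliberately weak checksum such as len makes every block here collide, and the byte comparison still keeps different blocks apart. That is the whole point of verifying instead of trusting the check.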
I've run dedup on a 2TB backup from a VM with as little as 192MB of ram and it works. A more preferable setup would be to have a bit more memory, like a gigabyte, but more importantly to have a SSD large enough to cache the filesystem meta-data. A 40G SSD is usually enough for a 2TB filesystem. That makes the off-line dedup quite optimal and also makes other maintenance and administrative operations on the large filesystem, such as du, find, ls -lR, cpdup, even a smart diff... let alone rsync or other things one might want to run... it makes all of that go screaming fast without having to waste money buying a bigger system or waste money on excessive energy use.
Another side note on DragonFly's HAMMER: de-duplication is implemented both as a daily pass AND can also be enabled for live writes. The daily pass can find all duplicate blocks. The live dedup uses a small fixed in-kernel-memory LRU style record of recent data block CRCs to find de-duplication candidates during live writes. Performance impact is minimal either way as recently recorded CRCs also tend to still have their data in the buffer cache.
The live-dedup mostly exists to get some up-front deduplication when someone, say, does a 'cp' or 'cp -r' or something like that. The real catch-all is the daily pass.
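The live-write side can be sketched like this in Python. It is an illustration only: RecentCrcTable, its capacity, and the read_block callback are invented for the example, and the real thing lives in the kernel and works against the buffer cache.

```python
import zlib
from collections import OrderedDict

class RecentCrcTable:
    """Small fixed-size, LRU-style record of recently written block CRCs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # crc -> offset of a recently written block

    def lookup_or_record(self, offset, data, read_block):
        """Return the offset of an identical recent block, or None.

        read_block(offset) fetches the candidate's data, which is
        assumed to still be cheap to get (e.g. from the buffer cache).
        """
        crc = zlib.crc32(data)
        candidate = self.entries.get(crc)
        if candidate is not None and read_block(candidate) == data:
            self.entries.move_to_end(crc)  # refresh its LRU position
            return candidate
        self.entries[crc] = offset         # remember this block instead
        self.entries.move_to_end(crc)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest CRC
        return None
```

Anything that falls out of the table is simply missed by the live path, which is fine because the daily pass is the catch-all.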
One interesting side effect of having de-duplicated backups is that we don't have to make a huge effort to avoid duplicate data in developer shell accounts. Developers have tons of git repos and fully checked out source trees all over the place and it doesn't bloat our backups all that much. This makes developers' lives easier too as they just don't have to worry about having lots of copies of things lying around.
Plus we are also backing up multiple machines' filesystems to the same backup filesystem and there's a lot of duplication on each machine which gets de-duplicated since the backups are all going to one target filesystem. It's a great feature just for that. I'm getting something like a 3.5:1 de-duplication ratio on our current aggregated backups. 4-5 TB of data winds up collapsing to around 700G or so on the backup system, without compression.
Well, as usual the story is more complex than that. The drive makers have recovered from the floods but tried to keep prices high artificially by holding back inventory. Now their inventory levels are eating them alive so I expect prices to ease soon.
They had a good run but at the end of the day they have to produce volume and if demand doesn't keep up prices have to go down to compensate. Pretty simple.
-Matt
Well, kinda typical of the retail investor sentiment but you probably don't have to sue anyone. If it happened just as you said, and you didn't mess around too much afterwards other than try to sell what you had, then your broker will have a record of it. Contact your broker so you are in the queue (your broker is probably handling thousands of complaints already). Your broker will probably negotiate compensation with the NASDAQ on behalf of all of its customers, including you, and your best bet is to get the refund through your broker and not try to bypass them.
Just be sure you make a formal complaint. If you don't you could wind up missing out.
Suing... despite what you hear in the press that's always a last resort, and generally does not yield results for the actual abused clients that they had hoped for.
When the internet crash occurred circa ~2000 there were literally thousands of lawsuits, and hundreds made class-action status. For the next TEN YEARS I would, every so often, receive a letter from a lawyer that included a check for my portion of some class action settlement or other... all in all, I've received about 5-7 checks over the years. Most of them I just threw away... literally pennies on the dollar, not even worth cashing. Another couple I returned to the Judge with a note saying how worthless this was to the supposed victims and he could keep the money. One or two of them were large enough amounts to deposit and buy a meal at a nice restaurant with. And that's about it. Getting satisfaction from a lawsuit (as a passive litigant in my case, simply by being part of the class), is basically impossible.
That all said, I'm also not in a particularly charitable mood (as the other poster indicated). You clearly intended to simply day-trade the stock and hope for a quick bump, yah? You were gambling, not investing. Otherwise you wouldn't have minded holding on 'for the long term'. Nobody was holding a gun to your head. Though the NASDAQ screw-up was unprecedented, these are the kinds of risks you take when you get involved in IPOs.
-Matt
AOL's problem was that they depended on users dialing directly into AOL's service... i.e. they went from dialup provider + portal to just portal and from there to irrelevance. Google and MySpace had little to do with it... the internet outgrew AOL's (and Yahoo's) model.
Lets not forget Netscape... same thing. Browser -> Portal -> irrelevance.
-Matt
Don't invest in things you can't value. Pretty simple answer.
In any case, I think you might be thinking of Amazon, not Google. Google has been an incredible investment pretty much throughout. Its value was cut in half during the 2008 crash but it recovered almost completely to its pre-crash high in just a year and a half. A year and a half worth of 'swoon' isn't a big deal for a real investor.
Amazon, on the other hand, which also did well out of the box, rode the internet hype all the way to the internet crash and then didn't recover to its pre-internet-crash heights for nearly another 10 years (until the 2008 crash), then dived, and finally recovered for good in late 2009. Now, of course, Amazon is doing a lot better, but having to sit on an investment for almost 10 years post-internet-crash definitely counts as a bad bet.
Apple is in a similar situation now, with people prognosticating its health or death based on short-term (a few months) movement of its stock. The jury is still out. But today's world is a much different world than the internet boom and bust was.
-Matt
That's easy. I'm a small investor and haven't been impeded by the big guys at all. They have provided opportunity after opportunity for buying and selling since the crash.
Most of the people on the boards I frequent (80% of whom are retired and well into their 80's, by the way), also haven't had any real trouble with the big guys. So if you are going to argue that you are stupider than a bunch of 80+ year old farts you are putting yourself into a corner. I'm a 45 year old not-quite-an-old-fart-yet and some of those guys do a better job than me (and I'm pretty good).
This is another one of those media-hyped stories that just isn't true. People are afraid of the markets, but they're afraid because the talking heads are telling them to be afraid, not because there is actually anything to fear.
To be fair, I would say that not everyone has what it takes to be an investor. There is volatility, it is possible to lose money... and the shorter-term view you have of the market the more money you can lose. And, unfortunately, most people (particularly younger people) have a very, very short-term view of the market. The vast majority of retail investors these days don't actually 'invest'. They (a) day-trade when they think they are investing and (b) don't have any real savings to invest with anyway. For that matter, people tend to not understand the vast, vast, VAST economic risks they take just having credit card debt.
I can give you an endless number of examples of this but perhaps the easiest to understand is to ask why people didn't invest during the crash once the Dow went below 9000. If you look at the graph in hind-sight, even though the Dow dipped well below 7000 before eventually bottoming, the period of time it spent below 9000 was only around half a year.
During that period people basically stopped thinking, believed in the end-of-the-world stories, and lost out on one of the biggest bull runs in history. And you didn't even have to predict the bottom to do it... even investing at 9000 with the market still dropping another 30% before bottoming... those people made out like bandits simply by being patient.
Nobody has patience these days, and that is a very bad fit for actually being able to become a good investor.
-Matt
Generally speaking (and ignoring FB which I've already commented on)... but generally speaking this is NOT true. The small guy actually has the advantage in this market, which makes it ironic that the small guys have mostly abandoned it.
The big guys have been fighting amongst themselves since the crash and it has created lots of opportunities for smaller retail investors to find really excellent entry points. Simply put, the reduced liquidity in the market gives the advantage over to the smaller players whose trades don't move stocks while the bigger ones get stuck fighting each other.
It used to be that 'dumb money'... a euphemism for the 'retail investor', gave the markets enough liquidity to allow the bigger players to enter and exit positions without excessively moving stock prices. These days with the big boys playing against each other and reduced liquidity it's more a matter of one big boy outwitting another because their trades move the underlying stocks too much. The small guys can take advantage of the much more obviously oversold conditions to buy, and overbought conditions to sell. The big guys can't.
The problem that a lot of retail investors have is that they don't actually know how to invest... they think they are investing when they are actually just day-trading. They pile into dangerous spaces that have already built up momentum to the upside instead of buying when they were low. For example, smaller players are STILL piling into the muni/govt bond markets even as we speak despite the huge risks involved as the Fed QE2 ends. Most retail investors sell during the inevitable pullbacks in these spaces (instead of selling during the rise), or buy well after a security has risen (instead of when it was closer to the bottom and still falling). They believe the crap that is fed to them by the media, believe the hype, believe the stories written by 13 year olds or guys with fancy titles and obvious conflicts of interest, and don't bother reading the financials of the companies they invest in or even listen in on the conference calls.
It doesn't take all that much work to actually invest properly, it just takes a bit of patience and a minimum of a medium term view (instead of a short-term reactionary view). The best investors in this market aren't the idiots who day-trade, it's the people who might do one or two small trades a week, maximum, slowly working long-term positions and collecting dividends while the big boys rattle the market back and forth and provide the great entry and exit points.
The deck just isn't stacked against us, people only believe it is.
-Matt
Correction, the updated guidance (that the analyst based his revised opinion on) occurred several weeks ago... so investors have even LESS of a reason to complain. This wasn't news. And, again, just because an analyst reduces their opinion based on reduced guidance doesn't change the fact that it was still just an opinion, one based on information that had already been widely disseminated weeks earlier.
So... I don't particularly like MS and I don't particularly think they did a good job, but good luck trying to blame them for this mess. People have only themselves to blame if they bought into this IPO.
-Matt
Actually, MS came out with a statement indicating that all IPO members (both retail and institutional investors) received updated guidance during the roadshow via a revision to the S1, and that the pricing of the IPO included that guidance. The analyst opinion was simply reflective of the revised guidance.
You'd have to be pretty stupid to assume that analysts wouldn't revise their opinions based on the change in guidance.
Well, you'd have to be pretty stupid to participate in the IPO in the first place, let alone invest in the stock. The thing was overpriced, the talking heads said it was overpriced, a simple high school math calculation would tell you it was overpriced, most people KNEW it was overpriced... and bought it anyway hoping for another 'sure bet' circa the internet frenzy leading up to the internet crash circa ~2000.
In some respects this is a good thing, it brings a much needed dose of reality to fuzzy-brained armchair investors.
If you want to complain about something you can complain about the NASDAQ screwing up the opening and not providing trade confirmations for 3+ hours to investors whose money was locked up and who could only watch the price start to drop without knowing whether they even owned shares, or being able to sell.
-Matt
It isn't that easy. Small buffers actually result in a high level of instability that can leave them completely empty as often as it leaves them completely full. If the buffer is too small performance goes completely to hell on a permanent basis. It becomes highly chaotic.
There are plenty of situations where you want a bigger buffer to handle a very short-term burst of information, but where that same big buffer becomes a liability when the information is continuously in excess of available bandwidth.
Similarly there are plenty of situations where a small buffer gives you very nice low latencies... but only for the packets that manage to make it through the router. That same small buffer will also drop way too many packets in numerous situations. Worse, the buffer is too small for other protocols at the edges to be able to react to changes in the situation. No protocol can react instantly but a small buffer pretty much requires all packet sources to react instantly. It just doesn't work.
So big buffers have latency issues and small buffers have stability issues.
In any case, neither of you two is even remotely close to being right. Having large buffers and 'solving' the latency issue by reordering packets is not an implementation, it's a desire. Anyone who has dealt with network backlogs knows that no matter how big your buffer is incoming packets can always fill it to the brim when output is unable to keep up with input. So at some point, no matter what, you have to drop packets. Simply dropping based on a time deadline is a horrible solution... that's even worse than RED. Assuming that you never have to drop anything is also a horrible solution because that implies fair-queuing at the center of the network which, due to the number of simultaneous connections running through the center of the network, requires effectively infinite buffer space... a nice concept, but not realistic.
So, huge buffers with fancy latency-based algorithms don't fix the problem. And small buffers don't fix the problem either.
-Matt
It's a part of the solution but not a complete solution. Boundary problems are different from center-of-network problems. Playing with TCP window sizes only works well at the edges and only in the outgoing direction (the 'inflight' sysctls that we've had forever), but does not completely solve the problem because you always need to add one or two additional packets above and beyond the calculated sweet spot to absorb changes in latency from other links and give the algorithm time to respond whenever reality changes.
This solution in both incoming and outgoing directions suffers from a connection multiplication problem. That is, it works fine if you have only a few simultaneous TCP connections running but it breaks down when you have dozens or hundreds due to the need to have 1-2 extra packets of slop in the reported window. It degrades gracefully in the outgoing direction but blows up very quickly when you are trying to control bandwidth in the incoming direction by changing the window size you report available in the outgoing ACKs (anyone running torrents can tell you this but the problem occurs for any busy network).
However, bandwidth limiting in this fashion DOES reduce packet backlogs and queues significantly. Not enough (it's simply impossible to make the algorithm stable at the sweet spot so you always need 1-2 additional packets), but significantly. Hence it is part of the solution.
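The connection multiplication problem is easy to put numbers on. The Python sketch below is back-of-the-envelope arithmetic, not a model of any particular TCP stack: the 2 MByte/sec link, 50 ms round trip, 1460-byte packets and the standing_queue helper are all assumed values chosen for the example.

```python
import math

def standing_queue(link_bytes_per_sec, rtt_sec, n_connections,
                   mss=1460, slop_packets=2):
    """Estimate the backlog caused by per-connection window slop.

    Returns (packets that fill the pipe, extra packets queued,
    seconds of queueing delay those extra packets add).
    """
    # The sweet spot: just enough packets in flight to fill the link.
    pipe_packets = math.ceil(link_bytes_per_sec * rtt_sec / mss)
    # Every connection carries its own 1-2 packets above the sweet spot.
    excess_packets = n_connections * slop_packets
    added_delay = excess_packets * mss / link_bytes_per_sec
    return pipe_packets, excess_packets, added_delay

one = standing_queue(2_000_000, 0.05, n_connections=1)
many = standing_queue(2_000_000, 0.05, n_connections=100)
```

With one connection the slop is a couple of packets against a pipe of roughly 69. With a hundred connections the slop alone is 200 packets, about three times what the link can hold in flight, and it sits in a queue adding well over a hundred milliseconds of latency.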
Step 2: On speed boundary changes you have to run fair queuing, period. Not only that but you have to do it on both sides of the boundary (i.e. in both directions each at the choke point). In my case I could never get truly reliable operation by only running fair queuing in the outgoing direction. I had to actually run the fair queue on *both* ends of the link, meaning I had to colocate a server to serve as the terminus for a VPN and run all the traffic over the VPN so I could control both ends. The fair queue also has to reserve bandwidth for pure TCP ACKs to prevent restricting the bandwidth in the opposite direction due to ACK starvation.
Fair queuing takes care of all remaining packet buffering issues at the edges of the network.
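For readers who have not seen one, a toy fair queue with an ACK reservation might look like the Python below. This is a simplification under stated assumptions: it rotates per packet rather than per byte, the 1-in-N slot reservation for pure ACKs is an invented policy, and the FairQueue class is hypothetical. It is not PF's or DragonFly's implementation.

```python
from collections import OrderedDict, deque

class FairQueue:
    """Round-robin across flows, with send slots reserved for pure ACKs."""

    def __init__(self, ack_every=4):
        self.flows = OrderedDict()  # flow id -> deque of queued packets
        self.acks = deque()         # pure TCP ACKs wait in their own queue
        self.ack_every = ack_every  # 1 of every `ack_every` slots goes to ACKs
        self.slot = 0

    def enqueue(self, flow, packet, pure_ack=False):
        if pure_ack:
            self.acks.append(packet)
        else:
            self.flows.setdefault(flow, deque()).append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if idle."""
        self.slot += 1
        reserved = self.slot % self.ack_every == 0
        if self.acks and (reserved or not self.flows):
            return self.acks.popleft()
        if not self.flows:
            return None
        flow, queue = next(iter(self.flows.items()))
        packet = queue.popleft()
        del self.flows[flow]
        if queue:
            self.flows[flow] = queue  # back of the rotation
        return packet
```

The interactive flow gets its packet out after a single bulk packet instead of waiting behind the whole backlog, and the ACK gets its reserved slot so the opposite direction is not starved.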
Center-of-network issues can't use the above solutions simply because there are too many connections flowing through the center of the network to track and not enough packet buffer space to sufficiently buffer all those connections for fairq operation (N x B is just too big). It's impossible to calculate where the choke points are based on connection tracking at the center-of-network. Easy at the edges, impossible in the center.
So the center-of-network has to provide additional congestion control through some sort of AQM, and if early-warning requests go unheeded it must start dropping packets. Personally speaking I hate the idea of having to drop packets, it leads to all sorts of problems everywhere, but the simple fact of the matter is that there is not enough per-connection buffer space at the center of the network to run fairq, nor is it an appropriate place for that. Without tracking you cannot mess with tcp window sizes in the center-of-network and even if you could track you can't bundle the connections based on where the choke point is at the moment, there is simply not enough information.
So we are talking at least three and probably closer to half a dozen different mechanisms being needed to reduce network latencies and still provide good and fair performance.
-Matt
I run a VPN to a colo so I have full control over the packet stream in both directions. I use PF on both sides with fair-queue (DragonFly of course), service separation, a separate channel for pure acks, etc. Works great, actually. I gang the VPN across both COMCAST and U-VERSE and tend to run full-out in both directions for long periods of time (I can even run it over 3G but my Android phone crashes pretty quickly when I do that.. oh well).
My cable bill has been about the same for the last 5 years, the only difference now is that the 'internet' portion of the bill is larger and the channel portion (I'm now down to basic cable) is smaller. But COMCAST is still getting their hunk of flesh out of me.
The COMCAST link has been the most reliable, and their bandwidth controls are fairly predictable. Even with the highest-speed internet plan they offer I can only count on around 2MBytes/sec downlink on a continuous basis, and around 250 KBytes/sec uplink. The only real issue I've had w/COMCAST is that their cable modem insists on assigning a non-routable IP when the physical link is down, but that was easily solved with a 'reject 192.168.100.1;' line in /etc/dhclient.conf.
I also have AT&T U-Verse... my advice, stick with COMCAST or, if you absolutely have to use U-Verse, don't bother with the static IPs. Basically AT&T doesn't know what they are doing when it comes to providing internet access. Their U-Verse crap uses MAC based filtering and can't handle multiple IPs behind a router, and it loses its mind every once in a while. The only semi-reliable way to connect to it is via a hard port on the WIFI router they supply through NAT. Basically you have to use their NAT service and WIFI router (though you can use a hard port on the router) to get anything even remotely reliable, and you can't do any fancy MAC filters because their WIFI router forgets about them every so often and breaks you. Fortunately the VPN runs over the NAT service just fine.
Sustained downlink and uplink rates with U-Verse are about 50% what I can get with COMCAST, only about 1 MByte/sec downlink and only around 150 KBytes/sec uplink with their highest rated service. If I go any higher I hit long periods of time where their link can't sustain the bw and packets start to build up on their routers instead of mine (where I can't control them), killing ping times.
I had long conversations with two sets of AT&T techs. The second guy knew what he was doing and cleaned up the copper to absolute perfection, and I can check the stats on the short-haul DSLx2 lines (that just go to the corner of the street), so these bw and reliability issues are not related to my twisted pairs.
They (AT&T) are lying if they say you get more than that in any sort of sustained manner. It's limited by their upstream... the physical link can handle more, but even though I don't use the U-Verse TV service their hardware still reserves uplink bw for it, which is annoying as hell, and their uplink and backbone is clearly under-provisioned.
Still, at least AT&T is creating some competitive pressure on COMCAST. Not a whole lot, but some. I wish Verizon had fiber in my area.
I've noticed a few other issues with U-Verse. AT&T's backbone hits log-jams every so often, drops a lot more packets than COMCAST just generally, and appears to choose routes to various places that go through problem-prone backbone infrastructure. It's quite annoying.
I had a DSL line for a while too, but uplink speeds from my location are a joke and the pricing is no longer competitive for the meager amount of bw I can get. It was reliable, but couldn't deliver the bw.
--
Insofar as streaming T.V. goes, it works pretty well here. NetFlix, Hulu, other apps. Strangely enough it works better through my VPN than it does directly, probably because neither COMCAST nor AT&T can do content-aware filtering of an encrypted VPN's UDP stream (and ganging aside). I mostly use Apple products &
Just in time since 60TB hard drives are just around the corner. ...
Oops.
-Matt
There was a list of keywords the CIA was known to filter on, so we'd often just insert them randomly into postings so they'd get read by some poor overworked CIA analyst.
This should be fun!
-Matt
Yah, I get those two mixed up all the time, and will continue to probably for the rest of my life. On the bright side people know it's actually me doing the posting when they read that and a few other grammatical mistakes that I often make.
-Matt
Once in the late 1990's we had a weird bug where FTPing or RCPing a particular file between two offices would often result in a corrupt file on the other end. We kept scratching our heads trying to figure out what could possibly be corrupting the file. FTPing it anywhere else succeeded... no corruption. Everything else between the offices seemed to work ok.
It wound up being a hardware issue with the T3 between the two offices. The hardware would corrupt the bitstream in a manner that tended to PASS the TCP/IP checksum, resulting in corrupted data. It required a particular pattern of 1's and 0's for the bitstream to be corrupted in a manner that passed the checksum, which this particular file happened to have.
These days, of course, I use scp to transfer files whenever possible. SSH will detect that sort of corruption and fail with a protocol error. Encryption has certain uses beyond just encrypting the data, it seems!
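To see why that kind of corruption can slip through, here is the standard 16-bit one's complement Internet checksum (RFC 1071 style) in Python, with two hand-made corruptions. The byte strings are invented for the demonstration and are not a reconstruction of what the T3 actually did to the data.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum as used by IP, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

original = bytes.fromhex("123456789abc")
# Corruption 1: two 16-bit words trade places. The sum does not care about order.
swapped = bytes.fromhex("567812349abc")
# Corruption 2: one word gains 1 while another loses 1. The sum is unchanged.
compensated = bytes.fromhex("123556779abc")

same_after_swap = internet_checksum(swapped) == internet_checksum(original)
same_after_compensation = internet_checksum(compensated) == internet_checksum(original)
```

Both corrupted payloads differ from the original yet produce the same checksum, so any fault that happens to damage data in a sum-preserving pattern goes straight through. A cryptographic integrity check like the one SSH applies to its stream does not have this blind spot.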
-Matt
Intel has had quite a few serious chip bugs too, all in errata. A number of new cpu bugs in both AMD and Intel chips always appear in new generations, but both companies have very large test suites and the number of new bugs goes down in every generation.
Don't forget that Intel had to recall a sandybridge chipset early in the sandybridge cycle, which cost them something like a billion dollars because the related motherboards had to be thrown away and replaced. That was due to internal on-chip circuitry related to a SATA port burning out.
Right at this moment AMD has two issues facing it in order to compete on workstations: (1) Power and (2) Performance. Their initial bulldozer release clearly depends too much on compiler optimizations to make full use of the architecture. They will clearly have to bulk-up some of the simplifications they made that made their cpu cores a little too sensitive to instruction sequences generated by compilers and I hope their next few releases will do better.
On power consumption it comes down to the Fab as much as anything else. Their dependence on the Fab is clearly a problem and they've made a break for it to try to solve it, even though it is costing them dearly. At the same time Intel has made some major advances in their three fabs, to the point where Intel can do their entire production on just two of those three fabs now but they decided to keep the third fab because they think they can 'grow into' it.
So AMD definitely has some work ahead of it, and I am hoping they reserve some of their focus for the high-end and don't concentrate entirely on laptops. I always like to say that I love AMD, but in the stock market I invest in Intel. That's just business. But I got on the AMD bandwagon big-time when they got to 64-bit first and I stuck with them all the way through the Phenom II.
Now, at this moment, Intel's SandyBridge has the best value and AMD's Bulldozer is quite far behind, so new purchases for me right now are Intel. That may change in the next year or two and when it does my new purchases will happily be in the AMD camp again. Frankly, AMD only has to get within shouting distance (~8%) of Intel and I will happily use AMD. AMD doesn't have to beat Intel.
I think there are a number of things AMD can do right now to compete better with Intel. One of the biggest is in the mini-server department (albeit clearly with lower volumes than their current focus on laptops & integrated graphics). AMD consumer cpus (aka Phenom II) always had ECC support but very few motherboards actually supported it, which made it difficult to use AMD for mini-servers and avoid the Intel Xeon tax to get ECC. If AMD worked on the mobo vendors to ALWAYS support an ECC option that would allow them to compete against Intel Xeons on price, even if they are unable to compete on performance.
On the opterons AMD clearly has the right idea going with high-core-count cpus, but the memory subsystem is lagging too much to really be able to make use of all those cores. That seems to be low-hanging fruit to me, something which should be readily addressable by AMD. The opterons still have a lot of value and potentially can have a radical improvement in value with Bulldozer, but only if AMD can push the core count and improve the memory subsystem.
On large multi-core boxes AMD also needs to improve CMPXCHG and other atomic instructions in situations where contention is high. Right now multi-chip opteron systems seriously lag Intel on contended latency due to cache coherency inefficiencies. Will Bulldozer fix those latency issues? I don't know.
AMD only needs to get within shouting distance of Intel for me to buy their chips, and work their mobo producers a bit more to get better overall support for their chip's capabilities. They don't have to beat Intel.
-Matt
The pushes and pops involved are for call-saved registers, not for arguments. Over the years GCC has kinda flip-flopped over the best way to handle that... whether to use PUSH and POP or to use SUB/MOV/MOV/MOV/... the MOV sequences produce much longer instructions, so if you are space-conscious (e.g. -Os), you are more likely to get PUSH/POP.
Intel and AMD cpus, over the years, have been better or worse at optimizing instructions which adjust the stack pointer. These days PUSH/POP sequences should be as fast as MOV sequences... maybe slightly slower fully cached but they'd get it back with reduced L1 instruction cache misses. I haven't done any exhaustive testing, however. Modern cpus can have so many instructions in-flight at once that simple non-dependent sequences such as PUSH/PUSH/PUSH or MOV/MOV/MOV are generally not going to bottleneck anything.
-Matt
What's really amusing is that I've been on the scene for so long if you google my name 'Matthew Dillon', the first entry is actually... me! And not the actor(s). I'm sure that grinds a bit but I do bask in the occasional fan mail reaching my inbox, just before I hit the 'delete' key.
In recent years it's started to flip back and forth, and I expect Hollywood will again take over the top spot after things die down again :-)
-Matt
Since the cat is out of the bag some further clarification is required so I will include some more of the email I received. I didn't quite mean for it to explode onto the scene this quickly, but oh well.
Again, note that this is *NOT* an issue with Bulldozer. And they will have a MSR workaround for earlier models.
>> quote
"AMD has taken your example and also analyzed the segmentation fault and the fill_sons_in_loop code. We confirm that you have found an erratum with some AMD processor families. The specific compiled version of the fill_sons_in_loop code, through a very specific sequence of consecutive back-to-back pops and (near) return instructions, can create a condition where the processor incorrectly updates the stack pointer.
AMD will be updating the Revision Guide for AMD Family 10h Processors and the Revision Guide for AMD Family 12h Processors, to document this erratum. In this documentation update, which will be available on amd.com later this month, the erratum number for this issue will be #721. The revision guide will also note a workaround that can be programmed in a model-specific register (MSR)."
end quote
They go on to document a specific workaround when the MSR is not programmed, which is basically to add a nop for every five pop+return instructions (though I'm not sure if the nop must occur between sequences or within the sequence). I will note that just the presence of 5xPOP + RET does not trigger the bug alone, it requires a very specific set of circumstances set up prior to that (that gcc's fill_sons_in_loop() procedure was able to trigger when gcc 4.7.x was compiled -O, when compiling particular .c files).
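For anyone curious what code would even be a candidate for that workaround, here is a toy scanner, assuming you already have a function's instructions as a list of bare mnemonics (for example, stripped down from disassembler output). The function name and the input format are my own invention for illustration. It only flags the 5xPOP + RET shape; as noted above, that shape alone does not trigger the erratum.

```python
def find_pop_ret_runs(mnemonics, min_pops=5):
    """Return the start index of every run of at least min_pops consecutive
    POPs that is immediately followed by a RET."""
    hits = []
    run = 0                          # length of the current run of POPs
    for i, op in enumerate(mnemonics):
        op = op.lower()
        if op == "pop":
            run += 1
            continue
        if op == "ret" and run >= min_pops:
            hits.append(i - run)     # index of the first POP in the run
        run = 0                      # any other instruction breaks the run
    return hits

# Hypothetical instruction stream: one 5xPOP + RET epilogue, one shorter one.
code = ["mov", "call", "pop", "pop", "pop", "pop", "pop", "ret",
        "push", "pop", "pop", "ret"]
hits = find_pop_ret_runs(code)
print(hits)
```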
As I said, this bug was very difficult to reproduce. It took a year to isolate it and find a test case that would reproduce it in a few seconds. Until then it was taking me upwards of 2 days to reproduce it on a 48-core and much longer to reproduce it on a 4-core.
Since the bug was stack pointer address sensitive, the initial stack randomization that DragonFly does multiplied the time it took to reproduce the bug. But without the stack randomization the bug would NOT reproduce at all (I would never have observed it in the first place). In other words, the bug was *very* stack address sensitive on top of everything else.
I was ultimately able to improve the time it took to reproduce the bug by poring over all my previous buildworld runs and finding the .c files that gcc had compiled that were most statistically likely for gcc to seg-fault in. Then once I isolated the files I iterated all possible starting stack offsets and eventually managed to reproduce the bug within 10 seconds using a gcc loop (10-20 gcc runs on the same file).
Changing the stack offset by a mere 16 bytes and the bug went away completely. The one or two particular stack offsets that reproduced the bug could then be further offset in multiples of 32K and still reproduce the bug at the same rate. Using a later version of gcc and the bug disappeared. Compiling with virtually any other options (turning on and off optimizations)... the bug disappeared.
On the bright side, I thought this was a bug in DragonFly for most of last year and set about 'fixing' it, and wound up refactoring most of DragonFly's VM system to get rid of SMP bottlenecks and making it perform much better on SMP in the face of a high VM fault rate. So even though we wound up not doing the 2.12 release the eventual 3.0 release (that we just put out recently) has greatly improved cpu-bound performance on SMP systems.
-Matt
AMD has indicated to me that the Bulldozer is not affected, which is a relief.
I guess I should have realized this would get slashdotted. In any case, it took quite a bit of effort to track the bug down. It was very difficult to reproduce reliably. It isn't a show stopper in that it really takes a lot of work to get it to happen and most people will never see it, but it's certainly a significant bug owing to the fact that it can be reproduced with normal instruction sequences.
I began to suspect it might be a cpu bug last year and after exhaustive testing I posted my suspicions in December:
http://leaf.dragonflybsd.org/mailarchive/kernel/2011-12/msg00025.html
Older versions of GCC were more prone to generate the sequence of POP's + RET, coupled with a deep recursion and other stack state, that could result in the bug. It just so happened that DragonFly's buildworld hit the right combination inside gcc, and even then the bug only occurred sometimes and only on a small subset of .c files being compiled (like maybe 2-3 files). The bug never manifested anywhere else, doing anything else, running any other application. Ever.
In particular the bug disappeared with later versions of GCC and disappeared when I messed with the optimizations. We use -O by default, not -O2. The bug disappeared when I produced code with gcc -O2 (using 4.4.7).
It is really unlikely that Linux is affected... the sensitivity to particular code sequences laid out in the compiler is so fine that adding a single instruction virtually anywhere could make the bug disappear. Even just shifting the stack pointer a little bit would make it disappear.
In any case, for a programmer like me being able to find an honest-to-god cpu bug in a modern cpu is very cool :-)
-Matt
Every open-source filesystem to-date has had serious pitfalls. Very serious pitfalls. In the Linux space it comes down to either significant bugs under heavy loads or extremely poor performance. I don't use Linux in production myself but I have several friends that do and they have yet to find any solution that doesn't occasionally explode in their faces. People talk about a lot of these linux filesystems as if they were the best thing since sliced bread but that's really only on paper. Every linux filesystem to-date has had and still has serious issues... everything from pseudo-commercialization or licensing to serious bugs when pushed... it's a mess.
In the BSD space there is basically no viable choice other than HAMMER1 (DragonFly) or ZFS (FreeBSD). And, no, I don't consider UFS w/softupdates and logging (let alone 'background fsck' or its very limited snap features) to be a viable choice.
HAMMER1 and ZFS also have serious deficiencies. For HAMMER1 it's excessive seeking to access meta-data. For ZFS it's excessive kernel memory use and the need for a lot of tuning to match the workload (and good luck with mixed workloads). With UFS you begin to hit major issues the instant kern.maxvnodes is hit, or the moment the directory hash cache limit is reached.
For DragonFly users, HAMMER1's meta-data issue is fairly easily solved. One big lesson we learned was that it doesn't actually take a whole lot of cache to cache the meta-data for even a modestly large filesystem (~several terabytes), so DragonFly's generic swapcache feature coupled with a small SSD solves the meta-data problem very neatly. DragonFly also doesn't have a maxvnodes issue for caching purposes with the HAMMER+SSD combination and it solves it WITHOUT having to integrate the SSD into the filesystem like ZFS does. I learned a number of other lessons from HAMMER1 as well, particularly when it came down to the level of sophistication required to manage HAMMER1's B-Tree and the vulnerability (for any filesystem) of depending too much on the free block map.
However, even with all the features HAMMER1 has (automatic fine-grained history, trivial snapshots, trivial streaming incremental backups, etc)... it couldn't get us to our goal.
HAMMER2 is going to give us numerous additional features while at the same time solving the limitations of HAMMER1 that prevented it from being easily extended to cluster setups. HAMMER2 will have all the features of HAMMER1 plus also writable snapshots, multi-branching snapshots, a copies mechanism that ought to work considerably better than ZFS's, larger checks (up to 192 bits), block compression, a better de-dup implementation, and numerous other features. Plus it will be better matched for the clustering features we want. And, on top of all of that, HAMMER2's code base is actually going to be less complex than HAMMER1's code base was.
The biggest lesson learned from the HAMMER1 work is that meta-data is easy to cache, even for super-huge filesystems. We are taking advantage of that realization to greatly simplify the allocation scheme to make snapshot management & features utterly trivial to implement. Most free space management will be disconnected from production access paths (reading AND writing).
-Matt
For our production systems it depends 100% on the actual amount of duplicated data, since bulk data reads are needed to verify the duplication. The number of passes is almost irrelevant because they primarily scan meta-data N times, not bulk data (duplicated bulk data only has to be verified once).
The meta-data can be scanned much more quickly than the verification of duplicated bulk data because the meta-data is laid out on the physical disk fairly optimally for the B-Tree scan the de-dup code issues. So meta-data can be read from the hard disk at 40 MBytes/sec even without the use of a SSD to cache it. Of course, with DFly's swapcache and the meta-data cached on the SSD that scan runs at 200-300 MBytes/sec.
But in contrast, the bulk reads used to validate the duplicate data just aren't going to be laid out linearly on the disk. There's a lot of skipping around... so the more actual duplicate data we have the larger the percentage of the disk's surface we have to read to verify it.
This is an area which I could further optimize in HAMMER's dedup code. Currently I do not sort the bulk data block numbers when running the data verification pass. Not only that but I am scanning a sorted CRC list, so the bulk data offsets are going to be seriously unsorted. Sorting them would definitely improve performance, probably quite a bit, but it still would not be anywhere near the 40 MBytes/sec the meta-data scan can achieve off the platter. It would not be a whole lot of programming, probably a day to do that. It currently isn't at the top of my list though.
What this means, in summary (and even with semi-sorting of the bulk data blocks), is that one can use a bounded amount of ram without really affecting the efficiency of the off-line de-duplication.
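As a rough illustration of why the ordering matters, here is a small sketch that compares total head movement when reading verification blocks in CRC order versus disk-offset order. The offsets and CRC values are made up, the cost model is just the sum of seek distances, and it ignores how the candidate pairs get matched up again after the reads. It is not HAMMER's actual code.

```python
def total_seek_distance(offsets):
    """Sum of absolute head movements when reading blocks in the given order,
    starting from offset 0."""
    distance = 0
    position = 0
    for off in offsets:
        distance += abs(off - position)
        position = off
    return distance

# Hypothetical (crc, disk_offset) candidates, already in CRC order the way
# a scan of a sorted CRC list would hand them over.
candidates = [
    (0x0A11, 900_000), (0x0A11, 12_000),
    (0x1B22, 450_000), (0x1B22, 3_000),
    (0x2C33, 700_000), (0x2C33, 260_000),
]

crc_order = [off for _, off in candidates]
offset_order = sorted(crc_order)     # one sweep across the platter

unsorted_cost = total_seek_distance(crc_order)
sorted_cost = total_seek_distance(offset_order)
print(unsorted_cost, sorted_cost)
```

Sorting in bounded batches rather than all at once would keep the memory use fixed while still recovering most of the benefit.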
-Matt
Well, I can tell you why the option is there... it's not because of collisions, it's there to handle the case where there is a huge amount of actual duplication where the blocks would verify as perfect matches. In this case the de-duplication pass winds up having to read a lot of bulk-data to validate that the matches are, in fact, perfect, which can take a lot of time versus only having to read the meta-data.
Just on principle I think it's a bad idea to just trust a checksum, cryptographic hash, CRC, or whatever. Corruption is always an issue... even if the filesystem code itself is perfect and even if the disk subsystem is perfect there is so much code running in a single address space (i.e. the KERNEL itself) that it is possible to corrupt a filesystem just from hitting unrelated bugs in the kernel.
Not to mention radiation flipping a bit somewhere in the cpu or memory (even for ECC memory it is possible to get corruption, but the more likely case is in the billions of transistors making up a modern cpu, even with parity on the L1/L2/L3 caches).
Hell, I don't even trust IP's stupid simple 1's complement checksum in HAMMER's mirroring protocols. Once during my BEST Internet days we had a T3 which bugged out certain bit patterns in a way that actually got past the IP checksum... we only tracked it down because SSH caught it in its stream and screamed bloody murder.
If you de-duplicate trusting the meta-data hash, even a big one, what you can end up doing is turning 9 good and 1 corrupted copies of a file into 10 de-duped corrupted copies of the file.
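Here is a small sketch of that failure mode, assuming each record carries the CRC that was stored in meta-data at write time plus the data as it sits on disk now. The record layout and the dedup() helper are hypothetical, for illustration only, and zlib's CRC32 stands in for whatever check the filesystem uses.

```python
import zlib

def dedup(records, verify=True):
    """records: list of (stored_crc, data) pairs.
    Returns a list where entry i is the index of the record whose data
    record i ends up sharing after de-duplication."""
    candidates = {}                  # crc -> indexes of distinct blocks kept
    target = []
    for i, (crc, data) in enumerate(records):
        match = None
        for j in candidates.get(crc, []):
            # With verify off we trust the stored CRC; with it on we
            # compare the actual bytes before sharing anything.
            if not verify or records[j][1] == data:
                match = j
                break
        if match is None:
            candidates.setdefault(crc, []).append(i)
            match = i
        target.append(match)
    return target

good = b"important file contents"
bad = b"important fiLe contents"     # corrupted after its CRC was recorded
crc = zlib.crc32(good)

# The corrupted copy happens to be scanned first; its stored CRC still matches.
records = [(crc, bad)] + [(crc, good)] * 9

trusting = dedup(records, verify=False)
verified = dedup(records, verify=True)
print(trusting)
print(verified)
```

With verification off, all ten copies collapse onto the corrupted block. With it on, the nine good copies share one good block and the bad one stays isolated.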
I'm sure there are many data stores that just won't care if that happens every once in a while. Google's crawlers probably wouldn't care at all, so there is definitely a use for unverified checks like this. I don't plan on using a cryptographic hash as large as the one ZFS uses any time soon, but being able to optimally de-dup with 99.9999999999% accuracy is a reasonable argument for having one that big.
-Matt
Yes, this is correct.
For on-line de-duplication the most optimal case in my view is to only de-dup data which may already be present in the buffer cache from prior recent operations, so the on-line dedup only maintains a small in-kernel-memory table of recent CRCs. This catches common operations such as file and directory tree copying fairly nicely.
The off-line dedup catches everything using a fixed amount of memory and multiple passes (if necessary) on the meta-data, then bulk data reads only for those blocks which appear to be duplicates to verify that they are exact copies.
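A minimal sketch of the fixed-memory, multi-pass idea, assuming the meta-data can be rescanned cheaply as (block number, CRC) records. The range-shrinking strategy and all the names here are my illustration of one way to bound the table, not a description of the actual HAMMER code. The groups it returns are only candidates; the bulk data still has to be read to verify them.

```python
def offline_dedup(meta, max_entries):
    """meta: list of (block_no, crc) records standing in for a meta-data scan.
    Keeps at most max_entries distinct CRCs in memory at a time.
    Returns ({crc: [block_no, ...]} for CRCs seen more than once, passes)."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    lo, passes, groups = 0, 0, {}
    while lo < 2**32:
        hi = 2**32                   # exclusive upper CRC bound for this pass
        table = {}                   # crc -> block numbers seen this pass
        for block_no, crc in meta:   # one full meta-data scan per pass
            if not (lo <= crc < hi):
                continue
            table.setdefault(crc, []).append(block_no)
            if len(table) > max_entries:
                hi = max(table)      # table is full: shrink the CRC range
                del table[hi]        # and leave the top CRC for a later pass
        passes += 1
        groups.update({c: b for c, b in table.items() if len(b) > 1})
        lo = hi                      # next pass resumes where this one stopped
    return groups, passes

# Hypothetical meta-data: blocks 1/4 and 0/2 share CRCs.
meta = [(0, 50), (1, 10), (2, 50), (3, 30), (4, 10), (5, 70), (6, 90)]
groups, passes = offline_dedup(meta, max_entries=2)
print(groups, passes)
```

Less memory just means more passes over the meta-data, which is the cheap part of the job.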
I've run dedup on a 2TB backup from a VM with as little as 192MB of ram and it works. A more preferable setup would be to have a bit more memory, like a gigabyte, but more importantly to have a SSD large enough to cache the filesystem meta-data. A 40G SSD is usually enough for a 2TB filesystem. That makes the off-line dedup quite optimal and also makes other maintenance and administrative operations on the large filesystem, such as du, find, ls -lR, cpdup, even a smart diff... let alone rsync or other things one might want to run... it makes all of that go screaming fast without having to waste money buying a bigger system or waste money on excessive energy use.
-Matt
Another side note on DragonFly's HAMMER: de-duplication is implemented both as a daily pass AND can also be enabled for live writes. The daily pass can find all duplicate blocks. The live dedup uses a small fixed in-kernel-memory LRU style record of recent data block CRCs to find de-duplication candidates during live writes. Performance impact is minimal either way as recently recorded CRCs also tend to still have their data in the buffer cache.
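The live side can be sketched as a small LRU keyed by CRC. This is an illustration under assumptions, not the kernel code: the class and method names are made up, the cached bytes stand in for data still sitting in the buffer cache, and zlib's CRC32 stands in for the block CRC.

```python
import zlib
from collections import OrderedDict

class LiveDedup:
    """Fixed-size LRU of recently written block CRCs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()  # crc -> data of a recently written block

    def write(self, data):
        """Return True if this write could share a recently written block."""
        crc = zlib.crc32(data)
        cached = self.recent.get(crc)
        if cached is not None and cached == data:    # verify, don't trust the CRC
            self.recent.move_to_end(crc)             # mark as most recently used
            return True
        self.recent[crc] = data
        self.recent.move_to_end(crc)
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)          # evict least recently used
        return False

live = LiveDedup(capacity=2)
writes = (b"AAAA", b"BBBB", b"AAAA", b"CCCC", b"BBBB")
results = [live.write(block) for block in writes]
print(results)
```

The last write of BBBB is missed because its record was already evicted, which is exactly the sort of thing the daily pass is there to catch.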
The live-dedup mostly exists to get some up-front deduplication when someone, say, does a 'cp' or 'cp -r' or something like that. The real catch-all is the daily pass.
One interesting side effect of having de-duplicated backups is that we don't have to make a huge effort to avoid duplicate data in developer shell accounts. Developers have tons of git repos and fully checked out source trees all over the place and it doesn't bloat our backups all that much. This makes developers' lives easier too as they just don't have to worry about having lots of copies of things lying around.
Plus we are also backing up multiple machines' filesystems to the same backup filesystem and there's a lot of duplication on each machine which gets de-duplicated since the backups are all going to one target filesystem. It's a great feature just for that. I'm getting something like a 3.5:1 de-duplication ratio on our current aggregated backups. 4-5 TB of data winds up collapsing to around 700G or so on the backup system, without compression.
-Matt