Maybe I've not yet had enough caffeine this morning - and this is bound to get moderated down (and might even cost me some karma too), but:-
[mild rant] I find it crazy that /. runs this story (which, let's face it, is about a cartoon series on TV), yet only a couple of weeks ago /. kicked up such a fuss about NOT wanting to run a story on the falling prices of tech stocks.
I don't even own a TV (and ain't done so for some years now) - which (I know) makes me have a slightly different world-view and value-set to most geeks - but this is not the D&D Role-Playing Game that we used to play, it is just a cartoon......
And whilst not many of us own tech stocks, when the whole tech industry loses so much (perceived) market value in such a short space of time it could affect our lives.... [/mild rant]
>> This is only possible if the tables are incredibly denormalized.
Well, they are quite denormalised, yes!
>> Are you perhaps saying that you only expose some specially created denormalized tables for reporting purposes?
No, the whole app architecture really is built around SQL queries with no joins. This makes our file-caching of db queries more efficient: each single-table result is cached once, so you end up with fewer cache files. If you cached the results of queries involving a join, you would get many more files, most of them containing essentially the same data as one another.
At the app level, we do do things that are effectively functionally identical to an SQL join. We do all of our db access through a layer of code that makes the caching invisible, and this layer also deals with the 'joins'.
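To give a rough idea of what an app-level 'join' behind a caching layer can look like, here is a minimal Python sketch. It is not the real engine or schema: the table names, `fetch_row`, `order_with_customer` and the dict cache are all made up for illustration, and sqlite3 in memory stands in for the real SQL server.

```python
import sqlite3

# Hypothetical stand-in database: sqlite3 in memory instead of the real SQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (10, 1, 'modem'), (11, 1, 'keyboard');
""")

_cache = {}      # (table, id) -> row dict; stands in for the file/memory cache
query_count = 0  # how many times the database was actually queried


def fetch_row(table, row_id):
    """Fetch one row by primary key with a single-table query, via the cache."""
    global query_count
    key = (table, row_id)
    if key not in _cache:
        query_count += 1
        # The table name comes from our own code, never from user input.
        cur = conn.execute(f"SELECT * FROM {table} WHERE id = ?", (row_id,))
        columns = [c[0] for c in cur.description]
        row = cur.fetchone()
        _cache[key] = dict(zip(columns, row)) if row else None
    return _cache[key]


def order_with_customer(order_id):
    """The 'join' done in application code: two cached single-table lookups."""
    order = fetch_row("orders", order_id)
    customer = fetch_row("customers", order["customer_id"])
    return {**order, "customer_name": customer["name"]}


first = order_with_customer(10)
second = order_with_customer(11)  # the customer row is reused from the cache
```

The customer row is cached once and shared by both orders, rather than being copied into every joined result.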
Can't say too much, coz we're only days from going live with a major e-commerce site & we're under NDA - but here's what I (think I) can say that may be helpful:-
The system we are deploying on is capable of sustaining a delivery of more than 230 dynamic pages per second (which we have benchmarked using RSW's test suite) - around 20 million dynamic pages in a 24 hour period.
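For anyone who wants to check the daily figure, the arithmetic is just the benchmarked rate multiplied by the seconds in a day (the constant names below are only for illustration):

```python
PAGES_PER_SECOND = 230           # benchmarked sustained rate quoted above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

pages_per_day = PAGES_PER_SECOND * SECONDS_PER_DAY
print(f"{pages_per_day:,} pages/day")  # prints 19,872,000 pages/day
```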
We are using Compaq PCs, each fitted with twin PIII @ 500MHz, 1GB RAM and a 9GB SCSI local drive (mirrored), all running WinNT 4. All networking is 100Mb, with doubled-up cards and cables (for fault tolerance). Our hosting ISP (big and in the UK) is responsible for maintenance, support, (main/primary) backup and hardware configuration.
The system is isolated from anything but port 80 requests by a pair of Cisco firewalls, and all the kit plugs into a Cisco Catalyst 5505 switch. Behind the wall sits a BigIP box (for fault tolerance and load balancing - we've found BigIP kit to be very good, BTW) which passes HTTP requests to a pair of identical web/application servers (with the above config) that are both running IIS 4.0.
The backend is connected to the front via a couple of hubs, and consists of a grand total of four boxes (again with the above config) running M$-SQL 7.0. Using Compaq's own clustering solution, these are linked up as two buddy pairs for 24x7 fail-over. Each pair of db servers appears as one virtual SQL server and shares a 186GB RAID5 array.
We are using a proprietary scripting engine, which interfaces as an ISAPI filter and currently only runs under NT (hence the choice of M$ technology throughout). This is similar in architecture/functionality to PHP or ASP - although it outperforms both.
Our application architecture allows us to split our db across two db servers (careful db schema design allowing load distribution). Furthermore all of our SQL queries are very atomic (no joins!) and the query results are then turned into script files and saved into a (db query) file cache on the RAID.
Before we implemented caching of db queries as files, the db access was the bottleneck with throughput. Since implementing the query cache we can run at full tilt with our db servers ticking over at well under 10% load. We are using pooled persistent connections to our dbs also.
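The query cache described above can be sketched roughly as follows. This is only an illustration under assumptions: JSON files stand in for the generated script files, a temp directory stands in for the RAID volume, and `run_query`, `cache_path` and `cached_query` are hypothetical names, not the real layer.

```python
import hashlib
import json
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # stand-in for the shared RAID volume
db_hits = 0                           # counts real trips to the database


def run_query(sql, params):
    """Hypothetical placeholder for the real database call."""
    global db_hits
    db_hits += 1
    return [{"id": params[0], "name": "widget"}]


def cache_path(sql, params):
    """One cache file per distinct (query, parameters) pair."""
    digest = hashlib.sha1(json.dumps([sql, list(params)]).encode()).hexdigest()
    return CACHE_DIR / f"{digest}.json"


def cached_query(sql, params=()):
    """Return rows from the file cache, hitting the db only on a miss."""
    path = cache_path(sql, params)
    if path.exists():
        return json.loads(path.read_text())
    rows = run_query(sql, params)
    path.write_text(json.dumps(rows))
    return rows


a = cached_query("SELECT * FROM products WHERE id = ?", (7,))
b = cached_query("SELECT * FROM products WHERE id = ?", (7,))  # served from file
```

Only the first call touches the database, which is the effect that takes the load off the db servers.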
Our script (embedded in HTML) is compiled into byte code on first access and written out as a file (similar to the stuff Zend are gonna do / are doing). This improves execution speed greatly after the first access.
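As an analogy only (the real engine is proprietary and is not Python), compile-on-first-access can be sketched with Python's built-in `compile()` and the `marshal` module. `load_code`, the cache directory and the `.bin` suffix are assumptions made for the example.

```python
import marshal
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # where compiled byte code is written
compile_count = 0                     # how many times we really compiled


def load_code(source_path):
    """Return a code object, compiling only if no fresh cached copy exists."""
    global compile_count
    source_path = Path(source_path)
    cached = CACHE_DIR / (source_path.name + ".bin")
    if cached.exists() and cached.stat().st_mtime >= source_path.stat().st_mtime:
        return marshal.loads(cached.read_bytes())
    compile_count += 1
    code = compile(source_path.read_text(), str(source_path), "exec")
    cached.write_bytes(marshal.dumps(code))
    return code


# A tiny hypothetical 'page script' to run three times.
script = Path(tempfile.mkdtemp()) / "page.py"
script.write_text("result = 6 * 7\n")

for _ in range(3):
    namespace = {}
    exec(load_code(script), namespace)
```

The source is compiled on the first request only; the next two requests load the byte code straight from the cache file.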
All files that are accessed by the script engine (which doesn't include the images/etc) are cached into memory. All the application code is stored on the RAID and is accessible to all web servers in the cluster - the performance hit due to accessing a file over the network is only incurred on first read (due to the local memory caches on the web servers' script engines).
When a db row is changed, the script file in the cache is rewritten and all the web servers are told to refresh the file cached in memory. (Mostly, our app reads from the db - writing to the db is quite infrequent compared to reading, hence this works very well for us).
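The refresh-on-write idea can be sketched like this. It is a simplification under assumptions: a shared dict stands in for the cache files on the RAID, `WebServer` and `update_row` are hypothetical names, and the 'refresh' here just drops the stale copy so the next read reloads it (the real system may well reload eagerly).

```python
class WebServer:
    """One web/application server with its own in-memory cache."""

    def __init__(self, name, file_cache):
        self.name = name
        self.file_cache = file_cache  # shared dict standing in for files on the RAID
        self.memory = {}              # local memory cache
        self.file_reads = 0           # reads that had to go to the shared files

    def get(self, key):
        if key not in self.memory:
            self.file_reads += 1
            self.memory[key] = self.file_cache[key]
        return self.memory[key]

    def refresh(self, key):
        # Drop the stale copy; the next get() re-reads the shared file.
        self.memory.pop(key, None)


def update_row(key, value, file_cache, servers):
    """On a db write: rewrite the cache file, then tell every server to refresh."""
    # (the real database write would happen here)
    file_cache[key] = value
    for server in servers:
        server.refresh(key)


file_cache = {"product:7": {"price": 10}}
servers = [WebServer("web1", file_cache), WebServer("web2", file_cache)]

before = [s.get("product:7")["price"] for s in servers]
update_row("product:7", {"price": 12}, file_cache, servers)
after = [s.get("product:7")["price"] for s in servers]
```

Because writes are rare compared to reads, each server pays for one extra file read per change and serves everything else from memory.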
Searching the site is all done through a table of keywords (again this table is file cached and then memory cached) - so we do not incur a performance hit from free-text searching of our pages.
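A keyword table of this kind is essentially an inverted index. Here is a minimal sketch, assuming made-up page names and text and a simple 'all words must match' rule; the real table layout is not described above, so treat `build_keyword_table` and `search` as hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pages; in practice the keywords would come from the site content.
pages = {
    "modems.html": "External modem with 56k upgrade",
    "consoles.html": "Games console with removable modem",
    "cables.html": "Cat5 network cables",
}


def build_keyword_table(pages):
    """Map each keyword to the set of pages that contain it."""
    table = defaultdict(set)
    for page, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            table[word].add(page)
    return table


def search(table, query):
    """Return the pages containing every word of the query, sorted by name."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(table.get(w, set()) for w in words))
    return sorted(hits)


table = build_keyword_table(pages)
modem_pages = search(table, "modem")
removable_pages = search(table, "removable modem")
no_pages = search(table, "fibre")
```

Each search is a handful of dictionary lookups and a set intersection, so no page text is scanned at query time.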
The system is fully scalable simply through the addition of extra web/application servers should the need arise (which I'm sure it will, one day!). As previously stated, the actual real world load on our db servers is negligible so we should be ok there for a while. If necessary, we can also move from 100Mb Cat5 up to optic-fibre if networking becomes the bottleneck.
All the CNN survey showed was that, within the small selection of programmers studied, there are definite differences in coding style, which in this study manifested themselves as US programmers writing fewer lines of code in a given period of time.
Which means absolutely diddly-squat in the really-real world.
I'm a UK coder, and I would like to point out that our US coder's style results in a lot more compound statements being used than I would write myself. Sometimes this may result in code that is harder to follow/debug.
You need more metrics than LoC to evaluate the 'productivity' of a programmer - and I think the true formula must involve some sort of measurement such as: pizzas eaten in office / time ;0)
The only thing you really get by measuring LoC is a minimum figure for how many times the enter key was pressed.
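To illustrate the point about coding style and LoC, here are two made-up snippets that do exactly the same job, one written out step by step and one as a single compound statement. The snippets and the `loc` and `run` helpers are purely illustrative.

```python
verbose = """
total = 0
for n in numbers:
    if n % 2 == 0:
        total = total + n
"""

compact = """
total = sum(n for n in numbers if n % 2 == 0)
"""


def loc(source):
    """Count non-blank lines of code."""
    return sum(1 for line in source.splitlines() if line.strip())


def run(source, numbers):
    """Execute a snippet against some input and return its 'total'."""
    namespace = {"numbers": numbers}
    exec(source, namespace)
    return namespace["total"]


sample = [1, 2, 3, 4]
verbose_result = run(verbose, sample)
compact_result = run(compact, sample)
print(loc(verbose), loc(compact))  # prints 4 1
```

Same result, a four-to-one difference in LoC, and nothing at all learned about which programmer was more productive.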
(Maybe I'm just getting old...?)
fRoGG
HTH,
Jim
Just my 2 pence worth - HTH.
Jim
> I really, really, really hope Sega wasn't stupid enough to hard wire the modem on the motherboard.
FYI, The modem on the Dreamcast is a removable pack on the side of the main unit.
Here in the UK they're releasing the Dreamcast with a 33.6 modem initially, with an upgrade planned in the near future.
They're calling it DPL - Digital PowerLine.
They claim that bandwidth will be up to 1Mb, degrading slightly under peak load.. :0)
Now that'd be nice!
Jim
software engineer
ubik.net