scottsevertson · Slashdot Mirror

Re:Make working with XML suck less... on The Future of XML · 2008-02-08 05:03 · Score: 1

I agree that some of the WS-* extensions are extremely useful, but only if you actually need the functionality/complexity.

The services I write are consumed by developers at educational institutions around the world, many using nothing more than a quick-and-dirty PHP script. They appreciate the simplicity and ease of use - our read-only resources can be explored directly inside a web browser, and I point folks to the RestTest Firefox plugin if they need to experiment with our writable services.

However, I would hardly define our web services as simple... We've just done a lot of design work to make them appear simple. Through careful definition of resources, well thought out URL schemes that are human readable/constructible, and liberal use of XLink to point out related resources, we've avoided much of the need for the complexity of the WS-* world.

With regard to WS-Security and WS-Trust - I've read and appreciated their specs before. Both appear well intentioned and well designed. However, the rest of the WS-* stack is so distasteful that it taints the gems that can be found. I'd rather take the concepts behind the specs, and implement the minimal subset my application needs. For example, see Amazon's REST authentication mechanism - simple yet effective.
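
If you haven't seen that style of authentication, here's a rough Java sketch of the general idea: client and server share a secret, the client sends an HMAC computed over a canonical description of the request, and the server recomputes it and compares. The string-to-sign layout and every name below (RequestSigner, the choice of method/date/path) are my own assumptions for illustration, not Amazon's actual format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs and verifies requests with a shared secret.
final class RequestSigner {
    private static final String ALGORITHM = "HmacSHA1";

    private RequestSigner() {}

    // Canonical text that both sides sign. This layout is an assumption for the sketch.
    static String stringToSign(String method, String date, String path) {
        return method + "\n" + date + "\n" + path;
    }

    // Returns the Base64-encoded HMAC of the canonical text under the shared secret.
    static String sign(String secret, String method, String date, String path) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), ALGORITHM));
            byte[] raw = mac.doFinal(
                    stringToSign(method, date, path).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC not available", e);
        }
    }

    // Server side: recompute the signature and compare it with the one received.
    static boolean verify(String secret, String method, String date, String path,
                          String signature) {
        byte[] expected = sign(secret, method, date, path).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signature.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }
}
```

Including the date in the signed text lets the server reject stale requests, and the secret itself never goes over the wire.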

As for WS-* interop, the base specifications are so complex that a complete web service toolkit is required to even contemplate interoperability. Even worse are the slight incompatibilities between major vendors - Microsoft's generated WSDLs for .NET services don't always map well to IBM's stack, for example (at least as of 2005).

I may have worded my initial post too strongly. Here's when I would consider WS-*:

Internal web services consumed by other resources at a single organization
Homogeneous development platform across servers and clients, or enough time to work around minor interop issues
A demonstrated need for more than a few WS-* extensions

Make working with XML suck less... on The Future of XML · 2008-02-07 17:06 · Score: 4, Interesting

"XML is really just data dressed up as a hooker."
--Dave Thomas

XML does suck if you stick with some of the W3C standards and common tools. Suggestions to make it less painful:

Ditch W3C's XML Schema
W3C Schema is painful; it forces object-oriented design concepts onto a hierarchical data model. Consider RELAX NG (an Oasis-approved standard) instead; it's delightful in comparison. Use the verbose XML syntax when communicating with the less technical - if you've seen XML before, it's pretty easy to comprehend:

<r:optional>
  <r:element name="w3cSchemaDescription">
    <r:choice>
      <r:value>painful</r:value>
      <r:value>ugly</r:value>
      <r:value>inflexible</r:value>
    </r:choice>
  </r:element>
</r:optional>

Switch to the compact syntax when you're among geeks:

element w3cSchemaDescription { "painful" | "ugly" | "inflexible" }?

There's validation support on major platforms, and even a tool (Trang) to convert between verbose/compact formats, and output to DTD and W3C Schemas. And, if you need to specify data types, it borrows the one technology W3C Schema got right: the Datatypes library.
Don't use the W3C DOM
The W3C DOM attempts to be a universal API, which means it must conform to the lowest common denominator in the programming languages it targets. Consider the NodeList interface:

interface NodeList {
  Node item(in unsigned long index);
  readonly attribute unsigned long length;
};

While similar to the native list/collection/array interfaces most languages provide, it's not an exact match. So, DOM implementers create an object that doesn't work quite like any other collection on the platform. In Java, this means writing:

for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i);
    // Do something with node here...
}

Instead of:

for (Node node : nodeList) {
    // Do something with node here...
}
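
If you're stuck with the W3C DOM, one workaround is a small adapter that makes a NodeList usable in an enhanced for loop. This is only a sketch; the NodeLists class and its iterable method are names I made up, not part of any DOM API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical utility: adapts a W3C NodeList to java.lang.Iterable.
final class NodeLists {
    private NodeLists() {}

    // Wraps the list without copying it; each call to iterator() starts from index 0.
    static Iterable<Node> iterable(final NodeList nodeList) {
        return () -> new Iterator<Node>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < nodeList.getLength();
            }

            @Override
            public Node next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return nodeList.item(index++);
            }
        };
    }
}
```

With that in place, the loop becomes for (Node node : NodeLists.iterable(nodeList)) { ... }, which is close to the second form but still needs the wrapper call.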

Dynamic languages allow an even more concise syntax. Consider this Ruby builder code to build a trivial XML document:

x.date {
  x.year "2006"
  x.month "01"
  x.day "01"
}

I thought about writing the W3C DOM equivalent of the above, but I'm not feeling masochistic tonight. Sorry.
The alternatives depend on your programming language, but plenty of choices exist for DOM-style traversal/manipulation.
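
For the curious, here is roughly what the W3C DOM version of that small date document looks like in Java. Treat it as a sketch: DateDocument and appendTextElement are names invented for this example, and even with the helper method it takes several times as much code as the builder version.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class: builds the <date> document with the plain W3C DOM API.
final class DateDocument {
    private DateDocument() {}

    // Builds <date><year>..</year><month>..</month><day>..</day></date>.
    static Document build(String year, String month, String day) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element date = doc.createElement("date");
            doc.appendChild(date);
            appendTextElement(doc, date, "year", year);
            appendTextElement(doc, date, "month", month);
            appendTextElement(doc, date, "day", day);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM implementation", e);
        }
    }

    // Creates <name>text</name> and appends it to the parent element.
    private static void appendTextElement(Document doc, Element parent,
                                          String name, String text) {
        Element child = doc.createElement(name);
        child.appendChild(doc.createTextNode(text));
        parent.appendChild(child);
    }
}
```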
Forget document models entirely (maybe)
In-memory object models of large XML documents can consume a lot of resources, but often, you only need part of the data. Consider using an XMLPull or StAX parser instead. Pull means you control the document traversal, only descending into (and fully parsing) sections of the XML that are of interest. SAX-based parsers have equivalent capabilities, but the programming model is uncomfortable for many developers.
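
To give a feel for the pull model, here's a minimal StAX sketch in Java that walks a document and only reads the text of elements it cares about. The TitlePuller class and the title element name are assumptions made up for the example; the javax.xml.stream calls are the standard StAX API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class: pulls out <title> text and skips everything else.
final class TitlePuller {
    private TitlePuller() {}

    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The caller drives the traversal by asking for the next event.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the element's text and leaves the reader on its end tag.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return result;
    }
}
```

No tree is ever built here, so memory use stays flat no matter how large the uninteresting parts of the document are.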
Even better, some Pull processors are wicked fast, even when using them to construct a DOM. In Winter 2006, I benchmarked an XML-heavy application, and found Woodstox to be an order of magnitude faster at constructing thousands of small DOM4J documents

Earthlink doesn't think it affects them on Time Warner Cable Implements Packet Shaping · 2007-06-10 10:17 · Score: 3, Informative

Just chatted with an Earthlink Sales-Bot:

Andy P.: Thank you for using EarthLink's live Sales chat. How can I help you today?
Scott: I'm considering switching to Earthlink Cable from Time Warner Cable, but I'm wondering if TWC's newly announced packet shaping policy will be affecting Earthlink customers? See http://www.dslreports.com/forum/remark,18468495~days=9999~start=100 for some details regarding their announcement.
Andy P.: One moment while I get that information for you.
Andy P.: No, this does not affect us.
Scott: How sure of that answer are you? No offense, but I don't want to subscribe, then later find out you were wrong.
Andy P.: The Topic on the Forum itself says "TW Officially Announces Packet Shaping for All RR User" It does not mention EarthLink and If this was the case with us we would definitely have received an update on this by now.
Scott: Thanks! Appreciate your time.

Could be the news hasn't trickled down to Sales, but I guess I'm hopeful. Only other option here is DSL, which has a higher total cost if you don't already have a phone line.

Re:I might respect Microsoft on Why Microsoft Won't List Claimed Patent Violations · 2007-05-14 14:22 · Score: 1

> If nothing else, you can hold up Excel as a shining example of excellence in software.

Excel? You're really holding that up as "excellent"? Excel is the software that:
* Redefines Leap Years but touts it as a feature [http://web.archive.org/web/20030926042409/support.microsoft.com/default.aspx?scid=kb;EN-US;q181370]? (Note: Archive.org link, as Microsoft seems to have moved the original KB article).
* Prevents you from opening more than one file with the same name, even in separate directories (can't find citation, but try it)?
* Until version *12* couldn't handle more than 256 columns [http://blogs.msdn.com/excel/archive/2005/09/26/474258.aspx]?
* Breaks the fundamentals of cut/copy-and-paste?

09 V4 8G 57 BK SD DT GG AM OL HL D2 60 on Censoring a Number · 2007-05-01 08:04 · Score: 2, Informative

I wonder if they'll be searching for the number in different forms... Like base 32?
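
Re-encoding a number in another base is nearly a one-liner, which is part of why filtering on one particular spelling seems futile. A small Java sketch; the Radix class is a made-up name, and the input in the usage note is a placeholder rather than the number in question.

```java
import java.math.BigInteger;

// Hypothetical helper: converts between hex and any radix BigInteger supports (2..36).
final class Radix {
    private Radix() {}

    // Re-encodes a hex string (spaces allowed between bytes) in the given radix.
    static String fromHex(String hex, int radix) {
        return new BigInteger(hex.replace(" ", ""), 16).toString(radix);
    }

    // Converts digits in the given radix back to lowercase hex.
    static String toHex(String digits, int radix) {
        return new BigInteger(digits, radix).toString(16);
    }
}
```

For example, Radix.fromHex("DE AD BE EF", 32) gives a base 32 string that Radix.toHex turns back into deadbeef. Leading zeros are not preserved, since the value is treated as a number rather than as a byte string.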

Aside: looks like *someone* killed the Digg story that included the number after a ROT-13 transform (http://digg.com/tech_news/A_useful_copyrighted_string_use_the_linked_URL_to_get_your_desired_target). Anyone want to place bets on whether Digg preemptively killed that story versus received a takedown notice? I'm guessing the former.

Experience on Are Open Source Reporting Tools Ready for Primetime? · 2006-03-08 16:48 · Score: 3, Informative

JasperReports has worked well for me in many situations, and the report file format (XML) is easy enough to work with, even without a designer front end. Performance in most use cases is respectable, and I didn't have to jump through a lot of hoops to get things running. Whenever I need "traditional" reporting, JasperReports is my first choice.

My current company is in the process of migrating away from Actuate e.Report, which has basically (but unofficially) been EOLed. For example, it'll run under Java 1.5, but every report logs an IOException during generation, and no fix is planned. Performance on large documents (read 1k+ pages) is unusable, and gets significantly worse as the page count increases (not exponentially, but worse than linearly). Oh, and don't plan on your users editing the RTFs it spits out - everything uses absolute positioning within the document, so the page doesn't reflow.

I spent some quality time with BIRT last month, but wasn't terribly impressed. Installation wasn't painless, and their underlying model assumes that your data is a flat, relational table. Our data is hierarchical in nature, and we would have had to either flatten it, or use tons of sub-reports to accomplish our goals. Additionally, the options for output format are pretty limited compared with other solutions.

We ended up settling on Windward Reports, for two main reasons:
1. They assume hierarchical data instead of relational.
2. Their design front end is any RTF editor, and produces editable RTF results (and can still output to HTML, PDF, etc).

Performance with Windward has been an order of magnitude better than e.Report in our worst cases, and they've been quite good about implementing minor new features that we needed.

A couple issues:
1. They're not open source, and are relatively pricey, especially when you're an Application Service Provider.
2. The code that is open (such as their data adapters) has a strange license, and hasn't been actively cleaned up in a while. Their license requires that you submit any non-company-specific improvements made to a data adapter back to them.
3. Their documentation is not up to date with their latest feature set, so be prepared to look at change logs, or ask questions on their forums. On the other hand, their tech support has been excellent.

We considered a number of other innovative reporting solutions as well. Just make sure that the reporting solution you pick actually meets your data and users' requirements, and don't be afraid to look beyond the "standard" reporting systems if you have non-traditional needs.

Government code reviews on Third Party Code Review? · 2006-02-21 20:11 · Score: 3, Interesting

I contracted with an electronic voting systems company last summer; one task was preparing code for an audit as mandated by the FEC. This was to be a manual audit (versus an automated audit like Fortify), conducted by a 3rd party government contractor.

Notes from the experience:
* We requested examples of code that met specific auditing criteria, and received back several somewhat-anonymized methods, apparently taken from competitors' products. You should verify that the bank has appropriate "handling procedures" for protecting 3rd party source code.

* Our audit criteria were spelled out in an FEC ruling in decent detail. We found that 50% of the rules could be easily expressed as existing Checkstyle "checks" [http://checkstyle.sourceforge.net/]; it was pretty easy to build custom "checks" to catch another 30%. We then used an Eclipse plugin [http://eclipse-cs.sourceforge.net/] to get real-time highlighting of detected issues (plus Ant scripts for command-line checking).

In your case, Fortify "rulepacks" appear quite proprietary/complex, so using their product is probably your only option for pre-audit auditing. If licensing is out of the question, and you can't strike a cross-promotional bargain (i.e. you market with "Secured by Fortify", they use you as a case study, you get a discount), try and get access to the tool through the bank before the official audit, or negotiate an appropriately flexible window of time in which to address any discovered issues.

* You're not "innocent until proven guilty" in an audit. In *many* situations, we had to argue against rules that were nonsensical in Java, or false-positive issues discovered by the audit. Some we won, most we lost; we faced an uphill battle on all.

* Our auditor was apparently not fluent in Java, and flagged several issues regarding the method names on classes in java.lang.*. Be thankful for automated auditing :)

Good luck!

Re:Paper trail worthless unless voter verifiable on WI Assembly OKs Voting Paper Trail · 2005-11-11 20:46 · Score: 1

Sorry, missed the latest revision:

Previously: ...generates a complete paper ballot showing all
votes cast by each elector at the time that it is cast
Now: ...generates a complete paper ballot showing all
votes cast by each elector that is visually verifiable by the
elector before the elector leaves the machine

Previously: ...and that enables a manual recount
Now: ...and that enables a manual count or recount

The new text confirms visual verification *and* equal validity with other ballots (otherwise, the paper record couldn't be used for a "count", only a "recount").

Go Wisconsin! Now how about a requirement for publicly open source, and some validation of software versions?

Paper trail worthless unless voter verifiable on WI Assembly OKs Voting Paper Trail · 2005-11-11 20:38 · Score: 3, Insightful

Any paper trail is worthless unless each voter is able to verify the printed record, *AND* the printed record is considered equivalent to any other vote. The Wisconsin bill only requires that a paper record be produced, not that the voter can see it. Why is this so important? Because of the FEC source code review clusterfuck.

HAVA [Help America Vote Act] gives the FEC governance over electronic voting, including establishing source code review procedures for all machines used in a Federal election (read: all voting machines). However, there are so many flaws in the FEC review procedure that it's downright scary.

1. Coding standards more concerned with technical compliance than correct function. Turns out, the coding standards say more about the correct format of a "for" statement, or the appropriate amount of boilerplate documentation per method, than they do about defining correct operation, error tolerance, or anything else.
2. FEC code review doesn't cover "libraries". Want to include malicious code that only kicks in on the appropriate date, with sufficient voting volume to bury the aberration in the noise? Throw it in a library, and use it in the project. Want to be really sneaky? Rebuild an open source library, or some external piece like a database driver or print driver with your malicious code.
3. Fudging allowed in FEC testing. System can't stay stable enough to run 100,000 votes sequentially on a single machine? Throw in automatic application restarts at a set interval into your test harness backend; test harness code isn't reviewed.
4. No enforcement procedure to verify reviewed code is the code running on election day. Not even checksums are required to verify compiled libraries/assemblies/executables are the same as the day they were submitted for review.
5. Reviewer incompetence. FEC reviewers may not be familiar with the language being reviewed. One claimed unequivocally that "length" was a Java keyword, and as such, couldn't be used as a variable name (a glance at the Java spec confirms his mistake). Why? Since it was used without parens like a method call, it must be a keyword.
6. Bogus documentation passes inspection. Don't have all the required class/method/variable documentation for the 2002 standards? Write a comment generator, fix it up a little by hand, and you're set!
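
On item 4, the missing check is not exotic. Here's a minimal Java sketch of comparing a deployed binary against the digest recorded at review time. ArtifactChecksum and its methods are names invented for the example, and SHA-256 is my choice of hash, not something any FEC procedure specifies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: checks that an artifact still matches its recorded digest.
final class ArtifactChecksum {
    private ArtifactChecksum() {}

    // Returns the SHA-256 digest of the data as lowercase hex.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Reads the whole file into memory; fine for a sketch, not for huge artifacts.
    static String sha256Hex(Path file) {
        try {
            return sha256Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the file's current digest equals the one recorded at review time.
    static boolean matches(Path file, String recordedHex) {
        return sha256Hex(file).equalsIgnoreCase(recordedHex);
    }
}
```

A checksum only proves the bytes are unchanged if the recorded digest and the checking tool are themselves trustworthy, so this is a floor rather than a complete answer.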

OK, so the coding review and coding standards suck. What's that have to do with the voter verifiable paper trail? Everything. Unless the voter can visually check the ballot (and ideally should have to "sign off on it" before the electronic vote is committed), what's to stop hidden/poorly reviewed code from altering the printout *AND* the electronic vote database?

What about the paper receipt being equivalent to a traditional paper ballot? Some voting legislation only allows the paper ballot to be used for verification, not as a true ballot. So, while you may recount the paper trail, the numbers from the recount are not legally votes, and cannot be used to change the outcome of an election (a fact that would be gleefully used by the conveniently "winning" side in a contested election). The Wisconsin bill does not specify in this matter.

How can we do better? Take a look at the procedure recommended by the Open Voting Consortium <http://www.openvotingconsortium.org/>. The *primary* representation of a vote is the printed paper ballot, with a machine readable representation output beside the human readable representation. After voting concludes, each paper ballot is scanned, and compared to the electronic count.

By the way, hope your voting machine vendor has valid source control procedures (like not using a single account for all checkins?), so a malicious contractor can't check in random changes to the code base/libraries. [Evil laughter...]

Re:Depends on the Coupling on Migrating Visual Basic Applications? · 2005-04-04 10:16 · Score: 5, Interesting

Even poor *quality* code can be *ported* easily to another platform/language, but only if the code is not highly coupled to the interface.

Case in point: I'm currently porting a portion of a PowerBuilder app [88k+ LOC] to Java. Fortunately, the code is not coupled to the interface (other than popping up message boxes for errors, due to PB's lack of exception handling).

My strategy? ~50 regular expressions to translate the syntax between PB and Java, plus ~30 classes emulating the PowerBuilder functions/libraries used by the code. The code quality is the same as it was before, but I had successful test cases running less than 3 weeks after starting the project (including developing the regular expressions and the support classes). If that's all you need, you can stop there.
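
To make the approach concrete, here's a toy Java sketch of an ordered list of regex rewrite rules applied one line at a time. The three rules are hypothetical stand-ins I wrote for illustration; they are not the ~50 expressions from the actual port, and SyntaxTranslator is a made-up name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: rewrites one source line using ordered regex rules.
final class SyntaxTranslator {
    // One rewrite rule: a pattern plus its replacement template.
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(String regex, String replacement) {
            this.pattern = Pattern.compile(regex);
            this.replacement = replacement;
        }
    }

    // Order matters: later rules see the output of earlier ones.
    // These rules are illustrative only; group 1 keeps the original indentation.
    private static final List<Rule> RULES = Arrays.asList(
            new Rule("(?i)^(\\s*)if\\s+(.+?)\\s+then\\s*$", "$1if ($2) {"),
            new Rule("(?i)^(\\s*)end\\s+if\\s*$", "$1}"),
            new Rule("<>", "!="));

    private SyntaxTranslator() {}

    static String translateLine(String line) {
        String result = line;
        for (Rule rule : RULES) {
            result = rule.pattern.matcher(result).replaceAll(rule.replacement);
        }
        return result;
    }
}
```

A scheme like this knows nothing about types or scoping, which is why the translated code still needs emulation classes underneath it and comes out with the same quality it went in with.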

I got away with this strategy because the code wasn't coupled to the interface. The code *was* tied heavily to the database, but it's much easier to mock up data access than it is to recreate a visual interface.

On the other hand, if you're talking about re-writing or cleaning up the code in the process, that's another story entirely. I've spent the past three months reworking the ported code to use Java idioms, decoupling it from the data layer, and refactoring the "almost re-typed with subtle code differences" sections; if the original code had been higher quality, the project would have been done two months ago.

Re:Spoke to Justin about this... on Ruby On Rails Showdown with Java Spring/Hibernate · 2005-04-04 06:44 · Score: 1

Did I even indicate they were working on "web pages"? OK, maybe I should have said "Ruby and/or RoR", but obviously both Java and Ruby can be applied outside of the web domain.

Re:Spoke to Justin about this... on Ruby On Rails Showdown with Java Spring/Hibernate · 2005-04-04 06:28 · Score: 1

LOL... It just ticked me off to see people questioning your Java implementation skills.

By the way, mind if I borrow a bit from your Secure Web Services talk? I'm doing Wisconsin WebSphere Users' Group in a couple weeks, and I'd like to nail them with the pillars: Confidentiality, Integrity, Authentication, Authorization, & Non-Repudiation. Credit given, of course :)

Re:Hibernate too hyped on Ruby On Rails Showdown with Java Spring/Hibernate · 2005-04-04 05:12 · Score: 2, Informative

iBATIS SqlMaps is my persistence *assistance* technology of choice, and has been for two years.

Notice *assistance* - it is not *meant* to compete with ORM solutions like Hibernate, which is why I use it. Ted Neward expresses it best: "Object-relational mapping is the Vietnam of Computer Science"

My take? Relational databases are procedural in nature, not object-oriented. Get over it, and stop trying to pretend otherwise! iBATIS SqlMaps makes writing relational database "method calls" very easy, and stays mostly out of the way.

Spoke to Justin about this... on Ruby On Rails Showdown with Java Spring/Hibernate · 2005-04-04 04:48 · Score: 5, Informative

At the Milwaukee No Fluff Just Stuff conference, I was involved in a lunchtime conversation with Justin and [Pragmatic] Dave Thomas about this subject, just days after Justin completed the Ruby code.

The consensus at that point: it probably wasn't a difference in *execution* speed, but smarter data retrieval strategies used by Rails' persistence layer. While Hibernate has excellent support for lazy loading, both developers thought that Rails was being *lazier*.

Justin's new numbers also point to faster caching in RoR's persistence layer: while both applications performed about equally without pre-cached data, RoR performed 20x better than the Java stack with cached data [both versions using similar caching strategies].

As for those questioning Justin's Java skills: he's one of the best programmers I've had the privilege to know, one of the best speakers I've listened to, and is freaking hilarious to boot. He's the co-author of O'Reilly's Better, Faster, Lighter Java, and he regularly speaks on advanced Hibernate, Spring, and a bunch of other Java topics.

He also points out a *significant* decrease in Lines of Code [Java: 3293, RoR: 1164] and Lines of Configuration [Java: 1161, RoR: 113]. While not an accurate gauge of effort, it is another point in Ruby's favor.

Last point for Ruby: Every single *top notch* Java programmer I know is at least playing with Ruby and RoR, with a large percentage [>50%] transitioning to Ruby as a first choice for new project work.

Don't call it a toy until you've played with it. There's some pretty convincing evidence that Ruby/RoR can beat Java for development effort, and now we're seeing it can beat it for performance, too.

If your kids have Leapfrog products... on Leapfrog Talking Pen · 2005-01-12 08:02 · Score: 1

...you'll know it'll sorta work about 50% of the time. Relatives have given my 4-year-old three different toys from them; not one has worked consistently.

In fact, the only thing reliable about their products is making my daughter cry after the toy crashes for the third time in five minutes!

Re:Sybase vs. everything else on Sybase Releases Free Enterprise Database on Linux · 2004-09-10 04:36 · Score: 1

Ouch, that hurt... I do feel like an idiot for misunderstanding allpages, but in my own defence:

1. I start off slamming every DB vendor out there. They all have their quirks, and Sybase seems to have its fair share.

2. I know the difference between page and table locks, but I misunderstood what Sybase meant by "allpages". You have to admit it could be confused.

3. I've been using Sybase for less than 5 weeks... That's why I was thankful for the post from the 12 year veteran Sybase DBA correcting me. I learned something.

4. The list of issues I mentioned are what I've already experienced in those 5 weeks. Usually it takes me a couple months to be completely annoyed with a DB implementation.

5. I'm a contractor with very little control over the other 146 databases on the server. Those databases were created by about 20 different groups over the past 5-6 years, and many haven't been updated to use "new" features (like Data Only Locking) from 11.9+. Unfortunately, we still need to get data from those databases, but we can't just go change locking schemes on a DB we don't own, with multiple legacy clients, which may even be external.

6. I'm the one now pushing for a cross-group policy on locking schemes and lock acquisition. Until that goes through, we're stuck in deadlock hell.

I know I'm probably responding to a troll, but I figured I should at least defend my honor :)

And I don't see anyone defending chained v. unchained modes or DDL inside transactions :)

Re:Sybase vs. everything else on Sybase Releases Free Enterprise Database on Linux · 2004-09-09 05:14 · Score: 1

Thanks for the correction. I misunderstood the Sybase docs:
> The pre-11.9.2 locking scheme continues to be
> supported; it is called allpages locking, and it
> is the default locking scheme when you first
> install or upgrade to version 11.9.2...

The allpages locking scheme name seems deceptive - it sounded to me as if it were all *pages*, hence the whole table.

We're having huge problems here recently with deadlocking; multiple development groups share the same databases, and there is no internal consensus on best practices with regards to table locking schemes, lock acquisition, etc. So, when someone else's mis-behaving query locks a significant portion of a shared table, our server starts spewing deadlock exceptions left and right.

Thanks again.

Sybase vs. everything else on Sybase Releases Free Enterprise Database on Linux · 2004-09-09 03:00 · Score: 2, Informative

Sybase sucks. Then again, so does every other database product I've ever worked with (Oracle 7i-9i, MS SQL Server 6.5-2000, Informix, Postgres 7.1 - 8.0, *and* MySql). They all just suck in different ways.

If you spend enough time with any product, you'll find little quirks that drive you insane. A couple off the top of my head for Sybase:

1. Chained & Unchained modes. Sybase supports a SQL 92-compliant transactional mode, and a hacked up "autocommit" mode, with optional transactional support. The hacked up mode is default, and the SQL 92-compliant mode has some severe limitations.

2. No DDL inside transactions. So what? That includes creating temporary tables. You want to call a stored proc from within a transaction? It better not touch the tempdb.

3. Table-level locking by default. This one just blows me away; Sybase didn't support row-level locking until somewhere around version 11, and table-level locking is still the default. If your DBAs aren't on top of things, you'll have deadlocks all over the place. They still haven't enabled it for the system tables, so make doubly sure you don't do any long-running code that touches them, or you'll have deadlocks for sure.

If you think I'm bullshitting, check out a quasi-white paper (grey paper?) I posted a while ago to the iBATIS support forums - it's got a lot more detail about some of the problems, and some Java-based work-arounds: http://sourceforge.net/forum/message.php?msg_id=2720775

So, would I choose Sybase over the competition? Maybe about 3 years ago, before other DBs got decent replication support - that was one of their claims to fame. Performance doesn't seem to be that big of an issue - hardware is often cheaper than engineering around db limitations.

If I had to rank what I've used in order of preference, it would be:

1. Postgres
Maybe because I've only used it for about 8 months, but Postgres has not *yet* disappointed me. Transaction support has been perfect, and no major performance problems. Then again, I haven't done any stored proc work, so maybe I should give it time.

2. MS SQL Server
I cringe to say this, but MS's developer tools push their DB up on my list. Query Analyzer has excellent "show plan" support, and their management tools are great. I'm generally pretty happy, although their JDBC driver could use some work, and DTS was pretty weak last time I used it.

3. Oracle
Cost knocks this one down a bit, and I'm a bit rusty as well. Last time I ran Oracle on Linux was shortly after it was released, and their install procedure was a *bitch*. However, nifty features like data partitioning were definitely worth the extra money.

4. Sybase
See above. It's decent, now it's free for small projects, but I'm annoyed.

5. Informix
I'm out of date on Informix, but I have bad memories, mostly of constantly overfilling the transaction logs, then having the DB stop working with an unclear error message. I understand the need for a DBA to monitor this on a production environment, but it was a pain in the ass in development.

6. MySql
OK, I'm going to get bashed on this one. The old limitations left a sour taste in my mouth, and too many critical features are brand new. I will reconsider, though, after it has a little more time to mature.

EAI/X3D on A New Chance For 3D On The Web? · 2000-10-04 22:52 · Score: 1

I have kind of a love/hate relationship with VRML - you can do some pretty incredible stuff with it, but the syntax is arcane, and scripting support varies widely between browsers.

A little known feature of VRML97 is the inclusion of the EAI (External Authoring Interface), which lets you tie in Java objects (or C++, in some browsers), and use the objects to dynamically affect the environment. Some projects actually only have a really simple stub .wrl file, and then build the rest of the stuff from a database on the back end, bypassing the syntax difficulties.

Unfortunately, the main reason that EAI hasn't revolutionized VRML and made it widely used is that not all browsers support it, and levels of compliance vary between the ones that do. Sound familiar (HTML, Java Applets, etc...)?

The next revision of VRML shows a lot of promise - it's an XML-based language called X3D (Yeah, I know, another damn X* acronym). Hopefully, by eliminating the need for specialized parsers, the browser writers will be able to concentrate more on spec compliance, rendering speed/quality, and cross-platform availability.

Right now, the best VRML97 browsers are Cosmo Player and ParallelGraphics' Cortona, but neither has a Linux version available (old versions of Cosmo are available for SGI). The specs for VRML97 and X3D are available, and are surprisingly readable.

Scott Severtson
Applications Developer

Applications Design... on On Building High Volume Dynamic Web Sites · 2000-03-07 23:07 · Score: 2

A lot of people are putting forward ideas centered around hardware - there are a number of things you can do in your software to design for scalability:

Application Cache: Figure out what objects are constant across your entire application, and build a simple object cache that loads them up, holds them in memory, and has the ability to refresh the cache on a background thread.
NO SESSION STATE: This one is counter-intuitive. You would think that it would be a good idea to cache some objects on a session-by-session basis, but it's not! Think about it this way: An average user clicks about 1 time per minute. By NOT caching objects for that user on a high volume server, you can use the resources to serve a couple thousand more requests. Also, most servers use cookies to hold a session ID, which must be read on each request - turning off sessions globally on your server will dramatically increase your maximum throughput.
Scalability, NOT Performance: This is a trap that many developers fall into - they think that by optimizing how fast pages come up, they are creating a scalable site. What this translates into is optimizing for a single user's performance. The actual goal is a linear ramp up of response time based on the number of users. I've seen sites that returned pages in record time choke under a load of 20-30 simultaneous users because of this.
Session State: If you truly need session state, create a separate database or LDAP server, optimized for reads, to hold your session state. Don't use cookies to hold your session ID - use a GUID or something similar as a URL parameter. This lets you have session state, but allows users to transparently switch between Web servers on the front end, and their session follows them. Also, you are only loading up session state on the pages that need it.
Load Test: After every major change, load test your application, and watch for that linear ramp up. Capture statistics internally in your application about how effective your caching is, how much time is being spent on a particular operation, etc. Use these stats to figure out where to tune next.
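
The Application Cache item is probably the easiest of these to sketch. Below is a minimal Java version: load the shared objects once, serve them from memory, and reload them on a background thread. RefreshingCache and the loader parameter are names I made up for the example; a real one would likely want logging and metrics around the reload.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: holds one application-wide value and refreshes it in the background.
final class RefreshingCache<T> implements AutoCloseable {
    private final Supplier<T> loader;
    private final ScheduledExecutorService scheduler;
    private volatile T value;

    RefreshingCache(Supplier<T> loader, long period, TimeUnit unit) {
        this.loader = loader;
        this.value = loader.get(); // load eagerly so readers never see null
        this.scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "cache-refresh");
            thread.setDaemon(true); // don't keep the JVM alive just for refreshes
            return thread;
        });
        scheduler.scheduleAtFixedRate(this::refresh, period, period, unit);
    }

    // Request threads only ever do a volatile read; they never wait on a reload.
    T get() {
        return value;
    }

    // Reloads the value; on failure the previous value keeps being served.
    void refresh() {
        try {
            value = loader.get();
        } catch (RuntimeException e) {
            // Swallowed on purpose: an uncaught exception would cancel future refreshes.
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The point of the design is that the slow reload happens off the request path, so response time under load doesn't depend on how long the data takes to fetch.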

I think that's about it. If you (or anyone else) have any more questions, feel free to email me @ scotty@auragen.com. I've been doing a lot of this stuff lately, and could probably give you some more pointers.

Scott Severtson
Applications Developer

I know all about this... on Ease of Use vs. Sweat Equity · 1999-11-23 20:19 · Score: 1

About two years ago, I was (unfortunately) responsible for setting up and maintaining an NT Web Server... We ran into an assload of problems with Microsoft's Java Virtual Machine on the system, when calling it from the web server. The web server would run fine, but the JVM would fail to load every single time. I ended up reinstalling the box about half a dozen times, but with no success... I then started experimenting with the order I was installing things... It turned out that I had to install the (Microsoft written) video drivers next to last, and then install Option Pack 4 again to keep things from breaking! The correct install process went something like:

Install NT.
Finish hardware setup, except for video.
Install SP3.
Install IE5.
Install Option Pack 4
Install video drivers.
Re-install Option Pack 4.

I probably lost about 3 weeks of work on the damn box, but now it's as stable as any NT box I've ever seen - it hasn't gone down in about 14 months. Note - I really do hate NT, but it can be a relatively good platform if you work out all the 8,000 kinks in getting it set up & deployed.

Scott Severtson
Applications Developer

Re:Creating a Kids' Website... on FTC Regulates Kids' Privacy Online · 1999-10-22 06:05 · Score: 1

I'm used to writing for clueless external clients, who tend to simply skim what I write. I've found that if my points don't jump out at them, they tend to get skipped over, often times resulting in countless headaches for me. It seems to work very well for technical reviews (search for "Protocol Specifications Review" on the page), because the important data/points immediately draw your eye.

That being said, it is quite possible that I went overboard on my other post ;)

And as to the debate about the origin of that style of writing, I did read MAD Magazine a little too much as a kid, but more recently, I have been looking at Jakob Nielsen's Alertbox column.

In reply to the rest of your comment:
If you and your employers had any ethics you would not be collecting the data in the first place.
Thank (insert deity of choice) that I don't work for that company - my company is a web design/programming agency, and we were hired by an external client to create the site. I may have exaggerated the evilness of the client - parents (if they read the permission statement) are informed that their child may be contacted from time to time to participate in optional surveys. The client is actually a well respected polling company based in Rochester, NY, and they are usually pretty straight up about things (other than a number of the employees being relatively clueless, but that's another rant...).

Moderator: Please mark this down as off topic.

Scott Severtson
Applications Developer

Creating a Kids' Website... on FTC Regulates Kids' Privacy Online · 1999-10-22 01:57 · Score: 4

I'm working on a kids' website for a client, and the FTC regulations aren't making my work any easier.

The client knew that these regulations had been coming for a while, so we have actually been dealing with these issues for the past two months.

Implementing the parental consent process has actually been quite easy. The hard part is trying to figure out how to spin things so that the parents want to sign the consent form! Granted, it's kind of hard to put a spin on using kids for market research data.

The strategy we ended up using was the same concept drug sellers use - give the kids a free hit or two on the web site, but to keep using it, they have to register. Hopefully by then, they are addicted.

I'm not sure if I morally agree with the purpose of the web site (no it's not evil or anything, but I don't like tricking kids/parents into revealing their shopping habits online), but it pays pretty well!

Scott Severtson
Applications Developer

Patent text, review on Patent Attempt on some forms of Dynamic Web Posting · 1999-05-24 02:45 · Score: 5

The patent is available at:

http://www.patents.ibm.com/details?pn=US05894554__&language=en

Review:

Things the patent DOESN'T cover: Load balancing with static content, hardware (i.e. router) based load balancing, round-robin load balancing.
The patent is designed around a single parent server, which passes off requests to children. Some load balancing implementations have all servers in the cluster take turns receiving requests, and passing off as needed, or use a round-robin DNS lookup, where requests come into a single DNS server (NOT a WEB SERVER, which is what the patent specifically states), which then hands them out to the various web servers in the pool, without regard to current server load.
One possible infringer: Microsoft, which has built new clustering technology into Windows 2000.
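
The difference between the two approaches in the review can be sketched in a few lines of Python. This is only an illustration of the general idea, not a reproduction of the patent's claims: the class names are made up, and "send it to the child with the fewest in-flight requests" is an assumed stand-in for whatever load measure a real parent server would use.

```python
import itertools


class RoundRobinPool:
    """Hands out servers in a fixed rotation, ignoring current load
    (the round-robin style the patent does not cover)."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class ParentDispatcher:
    """A single parent that tracks in-flight requests per child and
    forwards each new request to the least-loaded child."""

    def __init__(self, children):
        self.active = {child: 0 for child in children}

    def pick(self):
        # Ties go to whichever child was listed first.
        child = min(self.active, key=self.active.get)
        self.active[child] += 1
        return child

    def done(self, child):
        # Call when the child finishes a request, so its load count drops.
        self.active[child] -= 1
```

The round-robin pool needs no feedback from the servers at all, while the parent dispatcher only works if every request passes through (and reports back to) that one parent.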

On a side note, they reference a patent from Oracle which seems to claim that they invented the idea of having a web server retrieve data from a database through a stored procedure. It seems anyone can get a patent for anything these days.

Scott Severtson
Software Developer
Auragen Communications
scotty@auragen.com

Hardware-Based Searching on New Search Engines · 1999-05-04 09:24 · Score: 5

The specs on the search engine are available at http://web.fast.no/product/search/det.asp?id=34.

The press release doesn't exactly scream it out, but the search engine is actually just a little bit of software stuck on top of some pretty neat custom hardware. They call their chip the FAST PMC (Pattern Matching Chip), and their server is just your average (well, sort of average) high end server, with a buttload of those chips stuck on PCI cards.

The specs on the PMCs are available at http://web.fast.no/product/PMC/det.asp?id=52.

FAST claims 100 MB/sec throughput on each chip, and each card has its own RAM (from 8 MB to 2 GB). The chips actually run at 100 MHz each, and even have support for RegEx matching (slightly limited).

From the specs:
A typical configuration will contain 4 to 8 plug-in cards per search node, and 16 or 32 chips on each card.
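
Taking those numbers at face value, a quick back-of-the-envelope calculation gives the raw scan rate per search node. This assumes every chip runs at its claimed 100 MB/sec in parallel with no bus or memory bottleneck, which the specs quoted here do not actually promise, so treat the results as upper bounds. The function name is made up for the example.

```python
CHIP_MB_PER_SEC = 100  # FAST's claimed per-chip throughput


def node_throughput_mb(cards, chips_per_card):
    """Upper bound in MB/sec for one search node, assuming all chips
    scan at full rate in parallel (an assumption, not a spec figure)."""
    return cards * chips_per_card * CHIP_MB_PER_SEC


low = node_throughput_mb(4, 16)   # smallest typical configuration
high = node_throughput_mb(8, 32)  # largest typical configuration
print(low, high)                  # 6400 25600
```

So a typical node would top out somewhere between roughly 6.4 GB/sec and 25.6 GB/sec of raw pattern matching, if the chips really do scale linearly.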

Overall, I'm pretty impressed - putting search capabilities into hardware is a pretty good idea, especially since so much of a modern processor is geared toward things like Floating Point calculations, which don't help text searching at all.

Scott Severtson
Software Developer
Auragen Communications
scotty@auragen.com
