Slashdot Mirror


Obtaining Multi-Tier Application Logs for Research?

arohann asks: "I'm a research assistant in a well-known university in the US. As part of the research work my group is doing, we need access to the logs from a production system of an n-tier web-application. I've been looking around for a while with no result. Most places reply with a flat 'No!'. I was wondering if there is anyone who could help/advise with this. Please read about our requirement below and do let me know if you can help?" "We want to examine the request arrival behaviour of a real-world web-application and will also need to examine how long each request takes to be processed at each tier. We would collect this data over a few days and then use it to build a real-world model of the request behaviour of an internet application. This model would be used in our analysis and profiling of clustered, multi-tier, internet applications.

Of course, we realize it may be that some of this data cannot be shared due to client privacy concerns. However, let me assure you that we are not interested in any client details and we're not particularly concerned with what kind of an application it is as long as it's at least 3-tier, is a production system (we need a real-world model), and is used daily. We are also willing to sign a confidentiality agreement if necessary and follow any company protocol required to ensure that security and confidentiality are preserved.

Of course, if this results in any research paper publications, we would give credit to the supplier of the data.

Hoping to hear back from everyone soon ;)"
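As a rough illustration of the two measurements requested above (request arrival behaviour and time spent at each tier), here is a minimal Python sketch. The record layout, the tier names, and the sample values are all hypothetical; real logs would first need parsing into this shape.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical record layout: (request_id, tier, start_seconds, end_seconds).
# Tier names and timings are made up for illustration only.
SAMPLE = [
    ("r1", "web", 0.00, 0.05),
    ("r1", "app", 0.05, 0.25),
    ("r1", "db", 0.25, 0.30),
    ("r2", "web", 1.00, 1.10),
    ("r2", "app", 1.10, 1.50),
    ("r2", "db", 1.50, 1.60),
    ("r3", "web", 3.00, 3.05),
]


def interarrival_times(records, entry_tier="web"):
    """Gaps between consecutive request arrivals at the entry tier."""
    arrivals = sorted(start for _, tier, start, _ in records if tier == entry_tier)
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]


def mean_time_per_tier(records):
    """Average processing time (end - start) observed at each tier."""
    durations = defaultdict(list)
    for _, tier, start, end in records:
        durations[tier].append(end - start)
    return {tier: mean(times) for tier, times in durations.items()}


if __name__ == "__main__":
    print(interarrival_times(SAMPLE))
    print(mean_time_per_tier(SAMPLE))
```

Note that nothing here needs client details: a request identifier, a tier label, and two timestamps per hop are enough.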

7 of 40 comments

  1. My Suggestions by colemanguy · · Score: 3, Insightful

    My guess is you're gonna need to try to contact them using something other than email, probably some sort of certified letter.

  2. Talk to research friendly companies by jhoger · · Score: 4, Informative

    I'd start with companies that are already offering internships through your university. Find professors and graduate students that already have a working relationship with private sector folks and get introduced through them.

    Just cold calling or sending in letters or email is about as effective as you've found it to be.

    Also, you should try looking through published articles in trade journals and find out which companies are sponsoring research in your field by association with existing published research.

    The fact is that you'll certainly have to sign an NDA and likely they will have to scrub the data anyway. One way or another it's going to cost the donors $$$ that you aren't going to reimburse. Your project will have to fit in with their research goals or they'll be returning a favor from someone else.

    -- John.

  3. Use your professors by deranged+unix+nut · · Score: 3, Insightful

    Your best bet would be to have your professors call in a favor from former students or their contacts in the industry.

    Most companies will consider this to be a security risk. They don't even want you to know the rough design of their backends, let alone collect data from them.

    Some companies wouldn't know how to gather what you want and wouldn't risk letting you touch their systems.

    Most of these systems are probably messy, kludged together by former employees and hacked by current employees just enough to keep them running.

    If you have some time, get an internship and do your research on the side. :)

  4. What's it worth to the supplier? by stienman · · Score: 2, Informative

    > I'm a research assistant in a well-known university in the US. As part of the research work my group is doing, we need access to the logs from a production system of an n-tier web-application.

    Welcome to capitalism, we hope you enjoy your stay. While here, please note that TANSTAAFL.

    Asking for data from a business requires a lot of work on your part. You must somehow convince them that all the effort they are going to spend collecting, sanitizing, and providing you with the information is going to pay off for them in a reasonable way. Since this request involves several months of data, and more employee involvement than a 5-minute survey, you'll have to build a strong relationship with a company that has this data.

    Opportunities include:
    • The research will help you identify areas where improvement will save $$$ in [bandwidth|speed|latency|etc]
    • We can supply one or more interns to do all the internal work as well as work on a few other projects of your choosing
    • You (manager, CEO, IT lackey) got your degree here and still have fuzzy feelings for the school
    • Oh benevolent ones! May we sip at the firehose? Verily, this research will help this university provide graduates of the caliber which will dazzle the eyes! Yea, they will be cheap, too.

    The key here, as in everything to do with business, is to network, network, network. Don't email - you cannot possibly explain your research in a way that will make them go, "Gee, I think I'd like to devote company resources to these kids at the university of whatever!" in an email. At best send an email such as, "Dear sir, blah blah blah, we are researching n-tier applications and would like to spend a few moments talking with you about your architecture. When would be a good time to call?" Give it two days - call them in any case except if they patently refuse to talk to you. Don't engage in email conversations - in order to get good buy-in, you need to talk to them (if only briefly) so they can associate a voice with the email. Then email all you want.

    You may have better luck calling at the outset, introducing yourself and your research, then asking who at the company would be suited to help you out with your research. Then engage that person. Don't get too low on the totem pole or you may end up with someone who is ineffective within the company at getting you what you want. Certain companies (Google, for instance) are resource rich and may be easier to work with, especially if you can get one or two workers involved and spending their 20% time helping you. If your research isn't exciting on a general level, you're in for a rough ride.

    Once you've started a conversation (with several people at different companies - you're still trying to get something they will be reluctant to give) then you can start edging into what you need to complete your research. This whole process will take 2-6 months just to set everything up. I hope you've started early.

    Good luck.

    -Adam

  5. be more helpful, talk to open-source website by sonamchauhan · · Score: 2, Interesting

    ...
    > if this results in any research paper publications,
    > we would give credit to the supplier of the data.

    If that's all you offer in return, which company will allocate the resources to verify that (a) this breaches no privacy laws and (b) business advantage isn't sacrificed? ...And rightly so for companies whose constitution is 'maximise profit'.

    Some suggestions:

    1. Offer a quid-pro-quo to companies you contact: in return for access, you will deliver (say) a multi-page detailed architectural review and specific recommendations on potential improvements, reviewed, say, by your professor.

    2. Talk to people who run websites for non-profits, or open-source/creative-commons websites like wikipedia.org, sourceforge.net, even Slashdot. The attitude there may be more sympathetic to your efforts and the admins more willing to knock up a few Perl scripts to strip logs of sensitive information.

    3. Offer to be a website maintainer for a large independent open-source / community effort and obtain agreement on your access to logs.
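
    For suggestion 2, the kind of log-stripping script an admin might knock up could look roughly like the sketch below. It is in Python rather than Perl, and it assumes an access log in something like Common Log Format; the regular expressions, the `ip-` token prefix, and the function names are illustrative assumptions, not anyone's actual tooling. Keyed hashing only pseudonymizes addresses, so whether that is enough for a given privacy regime is a separate question.

```python
import hashlib
import hmac
import re

# Assumption: client addresses appear as dotted-quad IPv4 strings.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Assumption: query strings run from '?' up to the next space or quote.
QUERY = re.compile(r"\?[^\s\"]*")


def pseudonymize_ip(ip, key):
    """Map an IP to a stable token using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return "ip-" + digest[:12]


def scrub_line(line, key):
    """Replace IPs with tokens and drop query strings; keep timestamps."""
    line = IPV4.sub(lambda m: pseudonymize_ip(m.group(0), key), line)
    return QUERY.sub("", line)


if __name__ == "__main__":
    secret = b"kept-by-the-site-admin"  # hypothetical key, never shared
    raw = ('10.0.0.1 - - [01/Jan/2006:00:00:01 +0000] '
           '"GET /cart?user=42&sid=abc HTTP/1.1" 200 512')
    print(scrub_line(raw, secret))
```

    Timestamps and request paths survive the scrub, so arrival behaviour can still be studied, while the same client always maps to the same token as long as the admin keeps the key private.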

  6. I can probably help by abradsn · · Score: 2, Interesting

    I've written a couple of these (including one for an extremely large software company that I'm sure you've heard of), and I'm currently working on one right now on the side (for my own personal gain).

    If you reply to this comment with your email address, then maybe we can work something out.

    I need some help with testing my current project, and you need some data. It's actually more work for me to have someone besides myself test the software, but the quality should be higher and it could help you out.

    It would probably take at least a few weeks of work on your end, but you would definitely have your data at the end of it. Thanks, Brad

  7. Re:own them by SillySnake · · Score: 2, Insightful

    Oddly enough, you might own them. Surely a 'well known' university in the US has a website that gets plenty of hits a day. I'd start by looking there, and if they refuse to give them straight to you, a professor with some pull should be able to get any information you'd need from the IT department.