U.C. Berkeley Offers Free "Big Data" Class This Week

Posted by samzenpus on Monday August 20, 2012 @01:47AM from the get-your-learnings dept.

pmdubs writes "The U.C. Berkeley AMPLab research group will be hosting a free 'Big Data Bootcamp' on-campus and online, August 21 and 22. The AMP Camp will feature hands-on tutorials on big data analysis using the AMPLab software stack, including Spark, Shark, and Mesos. These tools work hand-in-hand with technologies like Hadoop to provide high performance, low latency data analysis. AMP Camp will also include high level overviews of warehouse scale computing, presentations on several big data use-cases, and talks on related projects."

16 comments

Oh good by Sarten-X · 2012-08-20 01:56 · Score: 2

Now maybe some of the folks here will actually learn how Big Data methodologies work, rather than just spamming links to a strawman argument starring the word "web-scale"...
Aw, who am I kidding... this is Slashdot! A knee-jerk reaction with little forethought is not only the norm, but the mandate!

--
You do not have a moral or legal right to do absolutely anything you want.
1. Re:Oh good by Trepidity · 2012-08-20 02:11 · Score: 4, Funny
  
  Unfortunately, learning how big-data methodologies work just isn't scalable.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
2. Re:Oh good by xclr8r · 2012-08-20 02:15 · Score: 1
  
  http://aws.amazon.com/free/ (requires credit card for verification but is free)
  
  run a free Amazon EC2 Micro Instance for a year, while also leveraging a free usage tier for Amazon S3, Amazon Elastic Block Store, Amazon Elastic Load Balancing, and AWS data transfer. AWS’s free usage tier can be used for anything you want to run in the cloud: launch new applications, test existing applications in the cloud, or simply gain hands-on experience with AWS.
  
  --
  Beware of those who profit off the docile and persecute the unbelievers.
3. Re:Oh good by Anonymous Coward · 2012-08-20 02:50 · Score: 0
  
  They're only offering the course for free so they can harvest your data, man!
  
  You're the product! YOU'RE THE PRODUCT!
  
  /puts on tin foil hat
4. Re:Oh good by medcalf · 2012-08-20 02:50 · Score: 1
  
  Aw, who am I kidding... this is Slashdot! A knee-jerk reaction with little forethought is not only the norm, but the mandate!
  Well demonstrated.
  
  --
  -- Two men say they're Jesus. One of them must be wrong. - Dire Straits
5. Re:Oh good by gl4ss · 2012-08-20 02:51 · Score: 1
  
  BGDB ..anyhow, anyone know when uc berkeley offered their first "cloud" class?
  
  --
  world was created 5 seconds before this post as it is.
6. Re:Oh good by AwesomeMcgee · 2012-08-20 02:53 · Score: 1
  
  I've been meaning to haskell one of those up to attempt an idempotent mesh data processing layer. AWS does sound like a lot of fun for play/learning; unfortunately, so do Hulu and my kid, so this has been on hold. Have you actually used any AWS and found it to be easy/good for practicing scaling software?
7. Re:Oh good by QuantumRiff · 2012-08-20 03:16 · Score: 1
  
  I don't think you understand. If you just make the class completely unstructured, and stop worrying about data ^h^h i mean learning being guaranteed, you can exponentially increase the number of people you can educate at web-scale by just adding more instructors.
  
  --
  
  What are we going to do tonight Brain?
8. Re:Oh good by Amouth · 2012-08-20 03:24 · Score: 1
  
  if you can script yourself then yes
  
  --
  '...if only "Jumping to a Conclusion" was an event in the Olympics.'
9. Re:Oh good by ArsonSmith · 2012-08-20 05:30 · Score: 1
  
  Back in the 60s they held them in multi-colored Volkswagen buses.
  
  --
  Paying taxes to buy civilization is like paying a hooker to buy love.
10. Re:Oh good by boogahboogah · 2012-08-20 07:23 · Score: 1
  
  And if you or one of your friends had access to 'the good stuff' you could be in the cloud for quite a while...
next free class: grammar 101 by Anonymous Coward · 2012-08-20 04:56 · Score: 0

FTA: "...and walk through’s..."
How Relevant Are The Technologies? by MikeTheGreat · 2012-08-20 07:19 · Score: 1

Can anyone shed some light on whether these technologies are niche/minor technologies, or whether they're actually popular / useful / used technologies?
"I've never heard of AMPLab" means just about nothing, given that I don't really spend a lot of time on Big Data. I recognize Hadoop (and MapReduce, Scala, etc., etc.), but most of the technologies used in this class seem to be specific to Berkeley.
(I'm almost afraid to ask, given that there's a grand total of 13 comments and it's already 1/2 down the /. main page :( )
How much of a potential market is there? by Anonymous Coward · 2012-08-20 07:52 · Score: 0

Some quick random googling turns this up:
ASUS P9X79
Support for up to 64GB of system memory with an 8-DIMM design
The sales literature says 32 GB/s RAM speed. So in 2 seconds I can process and parse the crap out of 64 gigabytes of random unstructured text data. WOW!
How many businesses have a total data set even that big? (bean counting/ customer /text data, not pictures or video)
A 5-drive RAID and a quad-core motherboard with 64 gigs running Windows or Mac or Linux. That's a pretty simple, standard, readily available solution to what, 95% or 99% of 'Big Data' problems?
In 18 months, Moore's law will double that for the same price.
If I put my Marketing Hat on, I do not see Big Data as anything but a small specialty market. And because it will never be mainstream, the big money will never be spent to make it easy to install, easy to program, easy to debug and easy on the end user.
Oh ya. Uptime. These big clusters aren't very fault tolerant and run only a few days without something breaking. This ain't Novell.
iPhone and Android users think 4 gig storage cards are big.
I have played with 'cluster' type computing. I tell you this. I will jump through a lot of hoops to make my application run on a single box under a mainstream OS before ever again trying to keep a room full of boxes running.
Just moving all the right data around to the right nodes is a big pain that likes to break TCP/IP stacks, routers, switches, OSes and the 4 gigabyte limit.
Getting the cluster 'booted' and 'booted reliably' each and 'every time' is a well earned excuse for much drinking.
Doing it 'in the cloud' just exchanges one set of problems for another.
Documentation is there, but you've got to be a true seeker, not deterred by anything.
I would take my estimate of project time and multiply it by 10 if the solution involves multiple boxes.
But if you need to query every byte in a terabyte in under a second, this is the only current solution.
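
The arithmetic in this comment can be checked with a quick back-of-envelope sketch. It takes the quoted 32 GB/s memory bandwidth at face value, treats 1 terabyte as 1000 GB, and assumes perfectly linear scaling across boxes with no parsing, network, or coordination overhead, so the numbers are best-case bounds rather than real measurements. The function names are made up for illustration.

```python
import math


def scan_seconds(data_gb: float, bandwidth_gb_per_s: float) -> float:
    """Best-case time to stream data_gb through memory once."""
    return data_gb / bandwidth_gb_per_s


def nodes_needed(data_gb: float, bandwidth_gb_per_s: float, deadline_s: float) -> int:
    """Minimum number of boxes to scan data_gb within deadline_s,
    assuming ideal linear scaling (an optimistic assumption)."""
    return math.ceil(data_gb / (bandwidth_gb_per_s * deadline_s))


# 64 GB on one box at the quoted 32 GB/s: the "2 seconds" in the comment.
print(scan_seconds(64, 32))

# 1 TB (taken as 1000 GB) on the same single box.
print(scan_seconds(1000, 32))

# Boxes needed to touch every byte of 1 TB in under a second.
print(nodes_needed(1000, 32, 1.0))
```

Under those assumptions a single box needs roughly half a minute to scan a terabyte, and a sub-second scan takes on the order of a few dozen boxes, which is the point of the comment's last line.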