Slashdot Mirror


Is Data Science For All the New Computer Science For All? (berkeley.edu)

UC Berkeley's fastest-growing class is their introduction to data science. (The Wall Street Journal calls it a combination of computer science and statistics "to mine the growing troves of data on everything from traffic patterns to the habits of social-media users.") But that's only the beginning. UC Berkeley plans to create a new Division of Data Science -- one of their biggest reorganizations in decades -- and this fall they even began offering a major in data science. "The division will enable students and researchers to tackle not just the scientific challenges opened up by pervasive data, but the societal, economic and environmental impacts as well."

"We need to consider the ethical implications of these technologies as they are being developed," says Data 8 instructor David Wagner -- "what does the world look like when decisions are made by algorithms rather than people, and how do we ensure that when we analyze data our decisions reflect not just numbers but the humans behind them?"

Slashdot reader theodp writes: With a reported 1,295 students enrolled this semester, Berkeley's Data 8: The Foundations of Data Science boasts even bigger numbers than Harvard's most popular course, the more traditionally CS-focused CS50, which saw 724 students enroll this Fall....

Berkeley's embrace of Data Science coincidentally comes as Code.org is giving kudos to partners Microsoft, Facebook, Google, and Amazon for helping it convince lawmakers and tens of thousands of educators that more traditional computer science is what's needed for the K-12 masses, including the adoption of a new AP Computer Science program for high school students (an AP CS version of CS50 was funded by Microsoft).

So, is Data Science for All the new Computer Science for All? And, if so, will U.S. schools be looking at a major case of buyer's remorse?

51 comments

  1. Capitalism by Anonymous Coward · · Score: 3, Insightful

    As with all things, education needs to be tied to reality.

    Society needs (or wants!) certain things, and the only sustainable and humane way to figure out what society wants, how much of it society wants, and who should be paying for it is Capitalism.

    Data science for all? Let those businesses who are seeking data scientists recruit promising folks, and pay for their education in an apprentice-style program. The government has no business playing around with this nonsense.

    Can't you people see it? It's right in the goddamn summary:

    Berkeley's embrace of Data Science coincidentally comes as Code.org is giving kudos to partners Microsoft, Facebook, Google, and Amazon for helping it convince lawmakers and tens of thousands of educators that more traditional computer science is what's needed for the K-12 masses

    Government corrupts business, not the other way around. They are just using their deep pockets to pay Big Government to swing its pistol this way and that; there needs to be a Separation of Business and State; there needs to be a Separation of Education and State.

    1. Re: Capitalism by Anonymous Coward · · Score: 0

      In computer science they teach nothing about data but they teach extremely powerful tools for working with data. In data science they don't teach much about data and they teach more generalized tools for analyzing data. It may be assumed in both cases that everyone has the same access and goals with respect to data, so maybe that is why they avoid dealing with real data sets.

    2. Re: Capitalism by RhettLivingston · · Score: 1

      If that is the case, then it has become too specialized. I switched from computer science to computer engineering mid-program back in the 80s, but both had multiple statistics classes as well as database classes (different but related).

      Data Science should be part of the fundamental base for any science or engineering curriculum. It is kind of hard to perform any science or engineering without it.

  2. Infomatics is Bullshit by Anonymous Coward · · Score: 0

    Informatics or "Data Science" is bullshit. Notice, this is the definition of bullshit that amounts to mindlessly following a course, right or wrong, because one truly believes whatever the course is. There is no reason to believe that computers, algorithms, machine learning, and/or large troves of data will produce meaningful results. Just think of all the effort Facebook and Google alike go to push ads on you to try to motivate you to buy things. Even beyond the scope of reverse psychology or simply the agenda of Facebook/Google beyond the support of their advertisers, it's patently clear that all these efforts are incredibly dismal in effectiveness.

    I don't see this radically changing because the dynamics of "gaming" the system are far from something that even humans are capable of fully grasping, let alone a complex system. There's also the obvious point that what people want is often something Facebook/Google refuses to offer, either because of legal restraints or because of their own political agenda. It might seem obvious, but try using a VPN to look up something unsavory but legal and see how useless the results are at producing what you aim for, either in search results or ads.

    So, agenda aside and capability aside, the truth is that people in power (which is what this is really about) want to use their power to influence and control people. They have some ability to do this, but there's a substantial amount of blowback when it's done badly. Usually it's done badly. So, yes, we should consider the ethical concerns, but I tend to view it all as a load of bullshit aimed to drain the coffers of the powerful. Perhaps others are into that. Personally, I don't like wading through bullshit.

    1. Re:Infomatics is Bullshit by Anonymous Coward · · Score: 0

      You keep thinking that. I'll keep collecting my rather large pay cheques for being engaged in machine learning research all day.

    2. Re:Infomatics is Bullshit by sfcat · · Score: 1

      You keep thinking that. I'll keep collecting my rather large pay cheques for being engaged in machine learning research all day.

      Then you aren't doing data science. Data science is applied ML. You are either inventing new algorithms (ML research) or doing data science (applied ML). Which is it?

      --
      "Those that start by burning books, will end by burning men."
    3. Re:Infomatics is Bullshit by q_e_t · · Score: 1

      Data science and machine learning overlap, but there are elements of data science that are concerned with plain old statistics, rather than machine learning.

    4. Re:Infomatics is Bullshit by q_e_t · · Score: 1

      I am trying to get back into the area after a gap, and it is proving to be tough, even with 20 years of prior experience.

  3. This just in: computers use math by xxxJonBoyxxx · · Score: 1

    >> Is Data Science For All the New Computer Science?

    Sure, change the name if it helps you attract funding and place graduates. I've been doing what we currently call "big data" or "data science" since the mid-nineties...and that was with a comp sci degree...issued by a math department.

    1. Re:This just in: computers use math by ShanghaiBill · · Score: 1

      Sure, change the name if it helps you attract funding and place graduates.

      Exactly. UC is getting less and less of its funding from the state, and more from tuition. So they need to run the university like a business. If the applying students (customers) want data science degrees, then that is what you sell to them.

      If the students learn skills that businesses want, everyone is happy, and it doesn't really matter what the degree is called.

  4. data science by Anonymous Coward · · Score: 0

    sorry, still can't get behind "data" as a "science" ... Most of what I see that is called "data science" is statistics and visualization. These are worthwhile and challenging pursuits, but that doesn't make them science. The few things that do occasionally resemble "science" about "data science" are a lot closer to mathematics.

    1. Re: data science by Anonymous Coward · · Score: 0

      The general rule is that any subject with 'science' in its name will not have much science in it, computer science is an exception

    2. Re: data science by Anonymous Coward · · Score: 0

      But it doesn't have much computers in it.

    3. Re: data science by Anonymous Coward · · Score: 0

      The computer scientists are the ones who do the computation at first. Once they figure it out, it is easy(ier) to write a program to do it on a computer a billion times over and over as needed if you can't afford to have a CS handy.

    4. Re:data science by Anonymous Coward · · Score: 0

      Medical doctors (generally) don't call themselves scientists. Neither do engineers, though both disciplines require understanding many principles discovered through science. The reason they don't call what they do science is because they aren't scientists. And that's ok.

      What the hell is wrong with being a statistician?

    5. Re: data science by K.+S.+Kyosuke · · Score: 1

      Or science, because it's actually mathematics.

      --
      Ezekiel 23:20
    6. Re: data science by Hognoxious · · Score: 1

      The general rule is that any subject with 'science' in its name will not have much science in it, computer science is an exception

      I used to take some courses in the CS department. Never once saw a single test tube, microscope or Bunsen burner.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  5. Splitting of diciplines. by jellomizer · · Score: 1

    Computer Science is a discipline that began as an offshoot of Mathematics. It is lighter on mathematics than a full Math major, with more focus on computation and algorithm design. A Computer Science degree is a math-light degree, but it isn't a light degree; it just covers different topics. Data Science, in turn, is an offshoot of Computer Science that puts more emphasis on data analysis and less on algorithms.
    As a Computer Scientist, I do a lot of data analysis, but it is a skill learned from experience more than a taught skill. Likewise, the generation before me, who had Math degrees, became Computer Scientists not because it was taught in school as a degree, but because work experience made them good at the job.

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    1. Re:Splitting of diciplines. by garcia · · Score: 2

      Data Science is more than the math-heavy side of CS; it should include a lot of business courses too as the single most important part of being a Data Scientist is understanding the business context of the models being built.

      Business Analytics courses try to make a Business-heavy Data Science program; however, there can be balance there, IMO.

      I have worked in the field (Data Engineering/ETL focus) for a decade and watched the massive changes in tools, need and understanding. These sorts of programs are doing a great job but still need to do more, based on what I've seen to date.

    2. Re:Splitting of diciplines. by Anonymous Coward · · Score: 0

      Data Science is more than the math-heavy side of CS; it should include a lot of business courses too as the single most important part of being a Data Scientist is understanding the business context of the models being built.

      In other words, money rules. Plenty of CS work is not solely to find a way for a company to make a buck, or a billion of them.

    3. Re:Splitting of diciplines. by AHuxley · · Score: 1

      The US is still going to need a really great math education system.
      Calling it Data Science still results in the need for math. Sooner or later the students will have to learn a lot more advanced math.
      People who entered on merit will do well as they know how to study and can learn more math.
      People who got selected on considerations other than the ability to study math will have a lot of math to do.

      --
      Domestic spying is now "Benign Information Gathering"
  6. The modern 'How to Lie with Statistics' class by Anonymous Coward · · Score: 0

    Using big data as the sledge hammer of the mighty priests of Science! Results are Settled Science! Deniers shall be punished!

  7. This headline by Anonymous Coward · · Score: 0

    Do you need a CS degree to understand it?

  8. If it's a trendy major it's already too late by Hognoxious · · Score: 2

    Four years from now the alumni of these courses will be able to take data about the number of college courses, the number of graduates emerging therefrom, the number of jobs available and the salaries offered and spot some really interesting patterns.

    Because one thing's for sure - they'll have the time.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:If it's a trendy major it's already too late by SaBumNim · · Score: 1

      It's not trendy because it sounds cool. It's trendy because there's a tremendous need. Touchpoints for data are going to grow exponentially as we have an Internet of more things. We're only at the beginning of the amount of data we're going to store and have available to analyze. We have a long way to go before we make good, data-driven decisions in even everyday cases (Daylight Savings Time, anyone?).

    2. Re:If it's a trendy major it's already too late by Hognoxious · · Score: 1

      Is that mathematical exponentially or journalistic exponentially?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  9. Should EditorDavid fuck off with the questions? by Anonymous Coward · · Score: 0

    Yes, yesh he should.

  10. Will these students be the next ... by WoodstockJeff · · Score: 1

    ... group of employees to complain that their hard work is being used for evil government surveillance and hold walk-outs to protest their work being used for evil purposes, other than marketing?

  11. Tip for the wise by Anonymous Coward · · Score: 0

    Communicating with non specialists is important
    This requires a lot of skill
    one area where fails are easy to see is in graphs
    if you have a time series with >3 variables, color is not a good option
    don't listen to Tufte; listen to Robbins http://www.nbr-graphs.com/

    pie charts are ok

    Y axis that doesn't start at zero is ok; you have to use judgement

    make it simple

  12. Unsettled science: What is Winter Sunlight? by Anonymous Coward · · Score: 0

    For this reason, God sends them a powerful delusion(operation of wandering)(planet) so that they will believe the lie.

  13. The basics by Tablizer · · Score: 1

    Logic, set theory, factoring patterns/relationships to remove repetition, and statistics should be among the basics. Specific languages often get one mired down in syntax and symbols. Save that for later.

  14. Did it for 24 yrs.: Data Processing/InfoSys by Anonymous Coward · · Score: 0

    I did that for 24 yrs. (Data Processing coding & information systems in client-server database design + implementation): "There is nothing NEW under the sun" & my MIS degree gave me the Stat I/II & CS degreework later gave me the DB skills (excel too but I always found databases MORE useful overall).

    * I.E. -. My entire JOB (& that of many here I suspect also) was aiding mgt. in making BETTER DECISIONS via "predication" & use of algorithms + reports for them to do so.

    APK

    P.S.=> More Computer Programming jobs existed there than ANY other type in the 1994-2007 timeframe I was a programmer-analyst/software-engineer - it IS the "steady-eddy" slow & steady wins the race of the CS job field imo... apk

  15. Re:Berkeley, eh? by Anonymous Coward · · Score: 0

    Today it's chocolate milk and a wooden club. Tomorrow who knows if the far left gets any real power.

  16. A good used of data science by Anonymous Coward · · Score: 0

    A great use of data science would be improving, or at the very least flagging, nonsensical story titles.

  17. Re:Berkeley, eh? by Brett+Buck · · Score: 1

    Or identify speakers to run off campus with a riot, in the cause of "free speech"?

  18. Heh... "ethics" by MikeRT · · Score: 2

    How about a basic course in logic, like the people who objected to Amazon's AI resume reviewer preferring men getting taught what "post hoc ergo propter hoc" means.

    The only thing that should concern us even more than black box algorithms is knowing that the people above will not rest until the algorithm gives them the expected output. Even if that means effectively demanding "garbage out, no matter the input."

    1. Re:Heh... "ethics" by Anonymous Coward · · Score: 0

      Not the "expected" output - the "desired" output.

  19. DS-101: Passive-Agressive Numerical Methods by Anonymous Coward · · Score: 0

    Instead of assigning letter grades for the course (or even pass/fail), each student is awarded a participation trophy.

  20. New Fad For All by Anonymous Coward · · Score: 0

    Data Archaeology is probably a closer term than Data Science. In some cases Data Paleontology.

  21. Poor duped students... by Anonymous Coward · · Score: 0

    Data Science students are already applying for internships and first jobs from many schools. These kids have been led to believe their degree is a CS or SE equivalent with a focus on machine learning. Every one of these kids that comes in has had 4 years of Python using TensorFlow or Caffe and that's it. The machine learning in industry is about custom accelerators, assembly-level programming, hardware design, compilers, and custom languages. We've already had to blacklist data science degrees from every school so far. These poor kids are unemployable in the very field they are so passionate about. We can hire a solid CS/SE graduate and teach ML/data science in 6 - 12 months on the job.

    1. Re: Poor duped students... by Anonymous Coward · · Score: 0

      I agree. We've interviewed a large number of data science majors. They can manually analyze data, but have no clue how to tie that analysis into any real system, or make it into a generally usable process. I think we found one, but only because he taught himself some CS on the side.

  22. And computer engineering is by Anonymous Coward · · Score: 0

    And engineering is more like 'tinker' till it works better without understanding why.

    Take a look at TensorFlow's ReLU or Maxout activation functions: the explanations are pretty ropy. Neither function fits the original purpose of an activation function (like sigmoid); someone tinkered and found it worked better.
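
    For reference, the three functions named above can be sketched in plain Python. This is only an illustration, not TensorFlow's implementation; the function names and the `pieces` parameter (a list of hypothetical weight/bias pairs) are assumptions made for the example.

```python
import math
from typing import Sequence, Tuple


def sigmoid(x: float) -> float:
    """Smooth squashing function with outputs between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    """Rectified linear unit: positives pass through, negatives become 0."""
    return max(0.0, x)


def maxout(x: float, pieces: Sequence[Tuple[float, float]]) -> float:
    """Maxout: the largest of several affine pieces w * x + b."""
    return max(w * x + b for w, b in pieces)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        # With pieces (1, 0) and (0, 0), maxout reduces to ReLU.
        print(x, sigmoid(x), relu(x), maxout(x, [(1.0, 0.0), (0.0, 0.0)]))
```

    Sigmoid saturates at both ends, while ReLU and Maxout are piecewise linear and unbounded above, which is the mismatch with the "original purpose" mentioned above.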

    But then it is a computer, you can simply try stuff, and the computer tests against the real case data, not against your understanding of that case data, which may be incorrect.

    I have an algo called 'Corres'. In its first draft it was pure: a gradient walk optimizer down the line of strongest correlation. Then I tweaked things and it worked a lot better, and I could not explain why it worked better. (Shove noisy data in, get a clean function out, fitted to the non-noisy portion of the data.)

    I really don't want to look at the code again to understand how it works..... lest it be magic and I break the spell!

    Implementing it a billion times is not the problem, getting a paper model to work on real data is the problem, and you may as well go straight in and play.

  23. The problem is industry doesn't know what it wants by Shadow+of+Eternity · · Score: 2

    Every single job I see with the word "data" in the name has some of the most comically excessive and overbroad requirements along with every adjective in the thesaurus for "expert". Positions described as "entry level" demand 5+ years of experience in a half dozen technologies ranging from Python and SQL to TensorFlow, Hadoop, and Spark, and you have to be a ninja, wizard, expert, and rockstar in all of them. As for degrees? That's the most hilarious part. They'll take anything from computer science to economics as long as it's a "highly quantitative field".

    Personally I'd rather take someone who proves they understand how to work with noisy and ugly real world data, and tell when the numbers are bullshit, and teach them to code than take someone who knows how to code and try to teach them to grok data.

    --
    A bullet may have your name on it but splash damage is addressed "To whom it may concern."
  24. Called it by enrique556 · · Score: 1

    On Slashdot not too long ago, there was some question about whether everyone should be taught programming in school, and I commented that statistics would be far more useful.
    Well, here we are, assuming "data science" is just a wanky name for statistics.

  25. If Nate Plastic is anything to go by by Anonymous Coward · · Score: 0

    then "muh data science" is a fraud, a sham designed to part fools from their money and votes.

  26. Maybe... by Anonymous Coward · · Score: 0

    ...and probably the new bad science for all. Good science starts with a problem or question and then gathers or finds the appropriate data to answer it. Doing it the other way around is called p-hacking, which more often than not gives meaningless patterns and correlations.
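
    The effect is easy to reproduce. The sketch below uses made-up random data (the sample size, variable count, and seed are arbitrary assumptions): it draws an outcome and a pile of unrelated noise variables, then reports the strongest correlation it can find among them.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_spurious_correlation(n_samples=20, n_variables=200, seed=0):
    """Search pure noise for the variable that best 'explains' the outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_variables):
        # Each candidate is independent noise, unrelated to the outcome.
        noise = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson(outcome, noise)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    print("strongest correlation found in pure noise:",
          strongest_spurious_correlation())
```

    Every variable here is noise by construction, so whatever correlation the search turns up is an artifact of looking at many candidates, not a finding.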

  27. Easy by aglider · · Score: 1

    No.

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
  28. I took that Berkeley data science course... by Anonymous Coward · · Score: 0

    It was plucky... and had a nice beat that I could dance to....

    I actually did like the way they wrapped the NumPy... they should tighten that up a bit and make it industrial strength...

  29. Re:The problem is industry doesn't know what it wa by Anonymous Coward · · Score: 0

    Good luck with that. As someone that's been hiring a lot over the past 10 years... It's much easier to find someone that knows how to code AND knows how to understand business problems.

    Not everyone gets coding... from https://www.joelonsoftware.com/2006/10/25/the-guerrilla-guide-to-interviewing-version-30/ :

    >> I’ve come to realize that understanding pointers in C is not a skill, it’s an aptitude. In first year computer science classes, there are always about 200 kids at the beginning of the semester, all of whom wrote complex adventure games in BASIC for their PCs when they were 4 years old. They are having a good ol’ time learning C or Pascal in college, until one day the professor introduces pointers, and suddenly, they don’t get it. They just don’t understand anything any more. 90% of the class goes off and becomes Political Science majors, then they tell their friends that there weren’t enough good looking members of the appropriate sex in their CompSci classes, that’s why they switched. For some reason most people seem to be born without the part of the brain that understands pointers. Pointers require a complex form of doubly-indirected thinking that some people just can’t do, and it’s pretty crucial to good programming. A lot of the “script jocks” who started programming by copying JavaScript snippets into their web pages and went on to learn Perl never learned about pointers, and they can never quite produce code of the quality you need.
    >>
    >> That’s the source of all these famous interview questions you hear about, like “reversing a linked list” or “detect loops in a tree structure.”
    >>
    >> Sadly, despite the fact that I think that all good programmers should be able to handle recursion and pointers, and that this is an excellent way to tell if someone is a good programmer, the truth is that these days, programming languages have almost completely made that specific art unnecessary. Whereas ten years ago it was rare for a computer science student to get through college without learning recursion and functional programming in one class and C or Pascal with data structures in another class, today it’s possible in many otherwise reputable schools to coast by on Java alone.
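
    For anyone who hasn't met the first interview question in the quote, here is a minimal sketch of reversing a singly linked list. It is in Python rather than C, so object references stand in for pointers; the `Node` class and the helper names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def from_list(values: List[int]) -> Optional[Node]:
    """Build a linked list whose nodes hold `values` in order."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head: Optional[Node]) -> List[int]:
    """Collect node values from head to tail."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def reverse(head: Optional[Node]) -> Optional[Node]:
    """Reverse the list in place and return the new head."""
    prev = None
    while head is not None:
        nxt = head.next   # remember the rest of the list
        head.next = prev  # point this node back at the reversed part
        prev = head
        head = nxt
    return prev


if __name__ == "__main__":
    print(to_list(reverse(from_list([1, 2, 3]))))
```

    The loop only re-points one `next` reference per node, which is the kind of indirection the quote is talking about.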