Is Data Science For All the New Computer Science For All? (berkeley.edu)
UC Berkeley's fastest-growing class is their introduction to data science. (The Wall Street Journal calls it a combination of computer science and statistics "to mine the growing troves of data on everything from traffic patterns to the habits of social-media users.") But that's only the beginning. UC Berkeley plans to create a new Division of Data Science -- one of their biggest reorganizations in decades -- and this fall they even began offering a major in data science. "The division will enable students and researchers to tackle not just the scientific challenges opened up by pervasive data, but the societal, economic and environmental impacts as well."
"We need to consider the ethical implications of these technologies as they are being developed," says Data 8 instructor David Wagner -- "what does the world look like when decisions are made by algorithms rather than people, and how do we ensure that when we analyze data our decisions reflect not just numbers but the humans behind them?"
Slashdot reader theodp writes: With a reported 1,295 students enrolled this semester, Berkeley's Data 8: The Foundations of Data Science boasts even bigger numbers than Harvard's most popular course, the more traditionally CS-focused CS50, which saw 724 students enroll this Fall....
Berkeley's embrace of Data Science coincidentally comes as Code.org is giving kudos to partners Microsoft, Facebook, Google, and Amazon for helping it convince lawmakers and tens of thousands of educators that more traditional computer science is what's needed for the K-12 masses, including the adoption of a new AP Computer Science program for high school students (an AP CS version of CS50 was funded by Microsoft).
So, is Data Science for All the new Computer Science for All? And, if so, will U.S. schools be looking at a major case of buyer's remorse?
As with all things, education needs to be tied to reality.
Society needs (or wants!) certain things, and the only sustainable and humane way to figure out what society wants, how much of it society wants, and who should be paying for it is Capitalism.
Data science for all? Let those businesses who are seeking data scientists recruit promising folks, and pay for their education in an apprentice-style program. The government has no business playing around with this nonsense.
Can't you people see it? It's right in the goddamn summary:
Government corrupts business, not the other way around. They are just using their deep pockets to pay Big Government to swing its pistol this way and that; there needs to be a Separation of Business and State; there needs to be a Separation of Education and State.
Informatics, or "Data Science," is bullshit. Note that this is the definition of bullshit that amounts to mindlessly following a course, right or wrong, because one truly believes in that course. There is no reason to believe that computers, algorithms, machine learning, and/or large troves of data will produce meaningful results. Just think of all the effort Facebook and Google alike put into pushing ads on you to try to motivate you to buy things. Even setting aside reverse psychology, or whatever agenda Facebook/Google have beyond supporting their advertisers, it's patently clear that all these efforts are incredibly dismal in effectiveness.
I don't see this radically changing because the dynamics of "gaming" the system are far from something that even humans are capable of fully grasping, let alone a complex system. There's also the obvious point that what people want is often something Facebook/Google refuses to offer, either because of legal restraints or because of their own political agenda. It might seem obvious, but try using a VPN to look up something unsavory but legal and see how useless the results are at producing what you aim for, either in search results or ads.
So, agenda aside and capability aside, the truth is that people in power (which is what this is really about) want to use their power to influence and control people. They have some ability to do this, but there's a substantial amount of blowback when it's done badly. Usually it's done badly. So, yes, we should consider the ethical concerns, but I tend to view it all as a load of bullshit aimed to drain the coffers of the powerful. Perhaps others are into that. Personally, I don't like wading through bullshit.
>> Is Data Science For All the New Computer Science?
Sure, change the name if it helps you attract funding and place graduates. I've been doing what we currently call "big data" or "data science" since the mid-nineties...and that was with a comp sci degree...issued by a math department.
sorry, still can't get behind "data" as a "science" ... Most of what I see that is called "data science" is statistics and visualization. These are worthwhile and challenging pursuits, but that doesn't make them science. The few things that do occasionally resemble "science" about "data science" are a lot closer to mathematics.
Computer Science is a discipline that began as an offshoot of Mathematics. It is lighter on mathematics than a full Math major, with more focus on computation and algorithm design. A Computer Science degree is a math-light degree, but it isn't a light degree; it just covers different topics. Data Science, in turn, is an offshoot of Computer Science that puts more weight on data analysis and less on algorithms.
So as a Computer Scientist, I do a lot of data analysis, but it is a skill learned from experience more than a taught skill. The same was true of the generation before me, who had Math degrees: they became Computer Scientists not because it was taught in school as a degree, but because work experience made them good at the job.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Using big data as the sledge hammer of the mighty priests of Science! Results are Settled Science! Deniers shall be punished!
Do you need a CS degree to understand it?
Four years from now the alumni of these courses will be able to take data about the number of college courses, the number of graduates emerging therefrom, the number of jobs available and the salaries offered and spot some really interesting patterns.
Because one thing's for sure - they'll have the time.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Yes, yesh he should.
... group of employees to complain that their hard work is being used for evil government surveillance and hold walk-outs to protest their work being used for evil purposes, other than marketing?
Communicating with non specialists is important
This requires a lot of skill
one area where fails are easy to see is in graphs
if you have a time series with >3 variables, color is not a good option
don't listen to Tufte; listen to Robbins http://www.nbr-graphs.com/
pie charts are ok
Y axis that doesn't start at zero is ok; you have to use judgement
make it simple
For this reason, God sends them a powerful delusion(operation of wandering)(planet) so that they will believe the lie.
Logic, set theory, factoring patterns/relationships to remove repetition, and statistics should be among the basics. Specific languages often get one mired down in syntax and symbols. Save that for later.
Table-ized A.I.
I did that for 24 yrs. (Data Processing coding & information systems in client-server database design + implementation): "There is nothing NEW under the sun" & my MIS degree gave me the Stat I/II & CS degreework later gave me the DB skills (excel too but I always found databases MORE useful overall).
* I.E. -. My entire JOB (& that of many here I suspect also) was aiding mgt. in making BETTER DECISIONS via "predication" & use of algorithms + reports for them to do so.
APK
P.S.=> More Computer Programming jobs existed there than ANY other type in the 1994-2007 timeframe I was a programmer-analyst/software-engineer - it IS the "steady-eddy" slow & steady wins the race of the CS job field imo... apk
Today it's chocolate milk and a wooden club. Tomorrow who knows if the far left gets any real power.
A great use of data science would be improving, or at the very least flagging, nonsensical story titles.
Or identify speakers to run off campus with a riot, in the cause of "free speech"?
How about a basic course in logic, so that the people who objected to Amazon's AI resume reviewer preferring men get taught what "post hoc ergo propter hoc" means?
The only thing that should concern us even more than black box algorithms is knowing that the people above will not rest until the algorithm gives them the expected output. Even if that means effectively demanding "garbage out, no matter the input."
Instead of assigning letter grades for the course (or even pass/fail), each student is awarded a participation trophy.
Data Archaeology is probably a closer term than Data Science. In some cases Data Paleontology.
Data Science students are already applying for internships and first jobs from many schools. These kids have been led to believe their degree is a CS or SE equivalent with a focus on machine learning. Every one of these kids that comes in has had 4 years of Python using TensorFlow or Caffe and that's it. The machine learning in industry is about custom accelerators, assembly-level programming, hardware design, compilers, and custom languages. We've already had to blacklist data science degrees from every school so far. These poor kids are unemployable in the very field they are so passionate about. We can hire a solid CS/SE graduate and teach ML/data science in 6 - 12 months on the job.
And engineering is more like 'tinker' till it works better without understanding why.
Take a look at TensorFlow's ReLU or Maxout activation functions and the explanations are pretty ropy. Neither function fits the original purpose of an activation function (like sigmoid); someone tinkered and found it worked better.
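For anyone who hasn't met these functions, here is a minimal plain-Python sketch of the three mentioned above. The function names are my own, not TensorFlow's API, and maxout is simplified to take the already-computed linear pieces for a single unit. It only shows the shape difference: sigmoid squashes everything into (0, 1), while ReLU and maxout are unbounded piecewise-linear functions.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Classic squashing activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Rectified linear unit: passes positives through, clips negatives to 0."""
    return max(0.0, x)


def maxout(pieces: Sequence[float]) -> float:
    """Maxout for one unit: the largest of several linear pre-activations.

    Assumption: the caller has already computed each w_i . x + b_i.
    """
    return max(pieces)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.3f}  relu={relu(x):.1f}")
    # ReLU is the special case of maxout with pieces (0, x).
    print(maxout([0.0, 2.0]) == relu(2.0))
```

Note that relu(x) is just maxout over the two pieces 0 and x, which is one way to see why both are piecewise-linear rather than squashing functions.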
But then it is a computer, you can simply try stuff, and the computer tests against the real case data, not against your understanding of that case data, which may be incorrect.
I have an algo called 'Corres'; it was pure in its first draft, a gradient walk optimizer down the line of strongest correlation. Then I tweaked things and it worked a lot better, and I could not explain why it worked better. (Shove noisy data in, get clean function out fitted to the non-noisy portion of the data.)
I really don't want to look at the code again to understand how it works..... lest it be magic and I break the spell!
Implementing it a billion times is not the problem, getting a paper model to work on real data is the problem, and you may as well go straight in and play.
Every single job I see with the word "data" in the name has some of the most comically excessive and overbroad requirements along with every adjective in the thesaurus for "expert". Positions described as "entry level" demand 5+ years of experience in a half dozen technologies ranging from Python and SQL to TensorFlow, Hadoop, and Spark, and you have to be a ninja, wizard, expert, and rockstar in all of them. As for degrees? That's the most hilarious part. They'll take anything from computer science to economics as long as it's a "highly quantitative field".
Personally I'd rather take someone who proves they understand how to work with noisy and ugly real world data, and tell when the numbers are bullshit, and teach them to code than take someone who knows how to code and try to teach them to grok data.
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
On Slashdot not too long ago, there was some question about whether everyone should be taught programming in school, and I commented that statistics would be far more useful.
Well, here we are, assuming "data science" is just a wanky name for statistics.
then "muh data science" is a fraud, a sham designed to part fools from their money and votes.
...and probably the new bad science for all. Good science starts with a problem or question and then gathers or finds the appropriate data to answer it. Doing it the other way around is called p-hacking, which more often than not gives meaningless patterns and correlations.
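To make the "data first, question later" problem concrete, here is a rough simulation sketch using only the standard library. Everything in it is pure noise by construction: a random target and a pile of random "features" with no relationship to it. The function names, sample size, and feature count are arbitrary choices of mine for illustration. Scanning enough features still turns up one that looks strongly correlated.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_samples=20, n_features=1000, seed=0):
    """Scan n_features columns of pure noise against a pure-noise target
    and return the correlation with the largest absolute value."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson(feature, target)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    one = strongest_spurious_correlation(n_features=1)
    many = strongest_spurious_correlation(n_features=1000)
    print(f"one pre-chosen feature: r = {one:+.2f}")
    print(f"best of 1000 features:  r = {many:+.2f}")
```

Testing one feature chosen in advance is a fair experiment; reporting only the best of a thousand is the pattern-hunting described above, and the "finding" carries no information at all.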
No.
Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
It was plucky... and had a nice beat that I could dance to....
I actually did like the way they wrapped the NumPy... they should tighten that up a bit and make it industrial strength...
Good luck with that. As someone that's been hiring a lot over the past 10 years... It's much easier to find someone that knows how to code AND knows how to understand business problems.
Not everyone gets coding... from https://www.joelonsoftware.com/2006/10/25/the-guerrilla-guide-to-interviewing-version-30/ :
>> I’ve come to realize that understanding pointers in C is not a skill, it’s an aptitude. In first year computer science classes, there are always about 200 kids at the beginning of the semester, all of whom wrote complex adventure games in BASIC for their PCs when they were 4 years old. They are having a good ol’ time learning C or Pascal in college, until one day the professor introduces pointers, and suddenly, they don’t get it. They just don’t understand anything any more. 90% of the class goes off and becomes Political Science majors, then they tell their friends that there weren’t enough good looking members of the appropriate sex in their CompSci classes, that’s why they switched. For some reason most people seem to be born without the part of the brain that understands pointers. Pointers require a complex form of doubly-indirected thinking that some people just can’t do, and it’s pretty crucial to good programming. A lot of the “script jocks” who started programming by copying JavaScript snippets into their web pages and went on to learn Perl never learned about pointers, and they can never quite produce code of the quality you need.
>>
>> That’s the source of all these famous interview questions you hear about, like “reversing a linked list” or “detect loops in a tree structure.”
>>
>> Sadly, despite the fact that I think that all good programmers should be able to handle recursion and pointers, and that this is an excellent way to tell if someone is a good programmer, the truth is that these days, programming languages have almost completely made that specific art unnecessary. Whereas ten years ago it was rare for a computer science student to get through college without learning recursion and functional programming in one class and C or Pascal with data structures in another class, today it’s possible in many otherwise reputable schools to coast by on Java alone.
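For reference, the two interview questions mentioned in the quote look roughly like this. This is a sketch in Python rather than C, so object references stand in for pointers, and the loop check is done on a linked list rather than a tree to keep it short (it uses Floyd's two-pointer method, which is my choice, not something the quoted article specifies). The class and helper names are made up for illustration.

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_list(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the values of an acyclic linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def reverse(head):
    """Reverse the list in place by re-pointing each node at its predecessor."""
    prev = None
    while head is not None:
        nxt = head.next   # remember the rest of the list
        head.next = prev  # flip this node's link
        prev = head
        head = nxt
    return prev


def has_cycle(head):
    """Floyd's check: a fast reference laps a slow one only if there is a loop."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


if __name__ == "__main__":
    print(to_list(reverse(from_list([1, 2, 3]))))
```

Both functions are a handful of lines; the difficulty the quote describes is in keeping straight which reference points where while the links are being rewired.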