What is Big Data? From the PowerPoint below by Jeffrey Popyack and William Mongan
We spent this morning talking about Big Data. Big Data has become an integral part of computing, and it could change a great deal about how we live as a species. Never before have we stored so much information, and we’re only now figuring out ways to process it. It’s super interesting, as it will change the way that we do science. Here are some notes I took (I’ll post the .ppt below):
1 byte is needed to store a single letter, digit, or other character, so “Big Data” = 8 bytes (ASCII), counting the space
The text of Dr. Seuss’ “Green Eggs & Ham” is 3.3 kilobytes in size.
Hadoop
Hadoop Distributed Filesystem (HDFS)
An equal split of the work does not necessarily take an equal amount of time
Latency due to processor speed and distances
Parallel Computing
JSMapReduce
Big Data PowerPoint - William Mongan and Jeff Popyack
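The byte arithmetic from the notes can be checked with a few lines of JavaScript. This is only a minimal sketch: `asciiByteLength` is a helper name I made up (it is not from the slides), and it assumes plain ASCII text, where every character takes exactly 1 byte.

```javascript
// Count the bytes needed to store a string as ASCII (1 byte per character).
// asciiByteLength is a hypothetical helper name, not something from the slides.
function asciiByteLength(text) {
  for (var i = 0; i < text.length; i++) {
    // ASCII only covers character codes 0 through 127
    if (text.charCodeAt(i) > 127) {
      throw new Error('Not an ASCII character at index ' + i);
    }
  }
  return text.length;
}

console.log(asciiByteLength('Big Data')); // 8 (the space counts as a character too)
```

The same idea scales up: a book of roughly 3,300 ASCII characters would come out to about 3.3 kilobytes.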
We then used JSMapReduce to count how many states contain each city/town name, so that we could find the most common one. Here’s the site (well, webarchive) of JSMapReduce. The code we ended up using is:
function Mapper(jsmr_context, data)
{
    // separate a line of tab-delimited data into separate entries …
    var words_list = data.split('\t');
    // extract the city and state name and output as a (city, state) pair
    jsmr_context.Emit(words_list[2], words_list[5]);
}
function Reducer(jsmr_context, city)
{
    // states already seen for this city
    var states_map = {};
    // (key,value-list) is (city,[state1,state2,state3,…])
    var number_of_states = 0;
    while (jsmr_context.HaveMoreValues())
    {
        var state = jsmr_context.GetNextValue();
        // count each state only the first time it appears
        if (!(state in states_map))
        {
            states_map[state] = 1;
            number_of_states++;
        }
    }
    // output the city together with its number of distinct states
    jsmr_context.Emit(city + ':' + number_of_states);
}
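To see what happens between the Mapper and the Reducer, here is a rough stand-alone sketch in plain JavaScript that mimics the map, shuffle, and reduce phases without JSMapReduce. The function name `runMapReduce` and the sample rows are made up for illustration; the only thing taken from the code above is that the city sits in column 2 and the state in column 5 of each tab-separated line.

```javascript
// A tiny stand-in for the map, shuffle, and reduce phases.
// runMapReduce and the sample rows are hypothetical, for illustration only.
function runMapReduce(lines) {
  // Objects without a prototype, so names like "constructor" are safe as keys
  var grouped = Object.create(null);

  lines.forEach(function (line) {
    // Map phase: pull a (city, state) pair out of every input line
    var fields = line.split('\t');
    var city = fields[2];
    var state = fields[5];
    // Shuffle phase: collect every value emitted under the same key
    if (!(city in grouped)) {
      grouped[city] = [];
    }
    grouped[city].push(state);
  });

  // Reduce phase: count the distinct states seen for each city
  var results = {};
  Object.keys(grouped).forEach(function (city) {
    var seen = Object.create(null);
    var numberOfStates = 0;
    grouped[city].forEach(function (state) {
      if (!(state in seen)) {
        seen[state] = true;
        numberOfStates++;
      }
    });
    results[city] = numberOfStates;
  });
  return results;
}

// Made-up rows: only columns 2 (city) and 5 (state) matter here
var sampleLines = [
  'a\tb\tSpringfield\tc\td\tIL',
  'a\tb\tSpringfield\tc\td\tMA',
  'a\tb\tSpringfield\tc\td\tIL',
  'a\tb\tPhiladelphia\tc\td\tPA'
];

var counts = runMapReduce(sampleLines);
console.log(counts.Springfield, counts.Philadelphia);
```

In the sample, the repeated Illinois row is only counted once, which is the same job `states_map` does in the Reducer. In a real MapReduce system the three phases would run on different machines; here they run one after another in a single function just to show the data flow.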
After lunch (at the Shake Shack), I spent some time reading about MATLAB on the internet. Here are some of the sites I used to learn the basics of MATLAB:
http://www.math.utah.edu/~wright/misc/matlab/matlabintro.html
http://www.cs.dartmouth.edu/~mckeeman/references/matlab101/matlab101.html
http://www.yorku.ca/jdc/Matlab/
https://www.cds.caltech.edu/~murray/wiki/images/4/45/Matlab_tutorial.pdf
http://aar.faculty.asu.edu/classes/eee480S12/matlab_primer.pdf