		Big Data 
		
		Der Spiegel, May 2013 
		Edited by Andy Ross 
Big Data is the next big thing. It promises both total control and the
logical management of our future lives. An estimated 2.8 zettabytes (ZB) of
data were created in 2012, with a predicted volume of 40 ZB by 2020. That
forecast implies data volume doubling roughly every two years: 2.8 ZB × 2⁴
≈ 45 ZB over eight years.
 Google and Facebook are giants of Big 
	Data. But many other organizations are analyzing all this data. Memory is 
	cheap, so new computers can analyze a lot of data fast. Algorithms create 
	order from chaos. They find hidden patterns and offer new insights and 
	business models. Algorithms bring vast power.
 
 Blue Yonder is a small 
	young company. Managing director Uwe Weiss analyzes the data generated by 
	supermarket cash registers, weather services, vacation schedules, and 
	traffic reports. All this data flows into analysis software that learns as 
	it goes and finds new patterns. Blue Yonder has used its data to drive a 
	market research system on buying behavior. Weiss: "Big Data is currently 
	revamping our entire economy, and we're just at the beginning."
 
 Big 
Data now brings hope for millions of cancer patients. At the Hasso Plattner
Institute (HPI) in Potsdam, near Berlin, a €1.5 million SAP HANA analytic
	engine with a thousand cores has so much memory it can process Big Data 
	thousands of times faster than other machines. SAP co-founder Hasso Plattner 
	sponsors the institute and personally pushed the "Oncolyzer" rig. The HANA 
	in-memory technology has won prizes for innovation and is now the flagship 
	SAP platform.
 
Researchers at the University of Manchester are working
on another Big Data project, this one to help senior citizens who live
alone. Their device is installed on the floor like an ordinary carpet, with
sensors recording footsteps. It can determine whether the person is up and
about, and can analyze activities to see how they compare with the person's
normal movements. Anomalies can trigger an alarm.
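
A minimal sketch of such an anomaly check in Python. The step-count
feature, the baseline figures, and the threshold are all invented for
illustration, since the article does not describe the Manchester system's
actual algorithm.

```python
# Hypothetical alarm logic: flag a day whose activity falls far
# outside this resident's own historical baseline.
import statistics

def is_anomalous(today_steps, history, threshold=3.0):
    """True if today's count is more than `threshold` standard
    deviations from the person's historical mean."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    return abs(today_steps - mean) > threshold * stdev

baseline = [5200, 4900, 5100, 5300, 4800, 5000, 5150]  # a normal week
print(is_anomalous(5050, baseline))   # False: an ordinary day
print(is_anomalous(300, baseline))    # True: barely any movement, alarm
```

A real deployment would track many activity features and a learned model
of the resident's routine, but the alarm logic reduces to the same
comparison against a personal baseline.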
 
 The military and 
	intelligence communities also employ the power of data analysis. Big Data 
	played a key role in the hunt for Osama bin Laden, leading investigators to 
	Abbottabad in Pakistan.
 
California software company Splunk was named a
few weeks ago as one of the five most innovative companies in the world.
	Governments, agencies, and businesses in almost a hundred countries are 
	customers, as are the Pentagon and the Department of Homeland Security. 
	Splunk apps analyze data supplied by all kinds of machines, including cell 
	phone towers, air-conditioners, web servers, and airplanes.
 
 Hamburg-based startup Kreditech lends money via the Internet. Instead of 
	requiring credit information from their customers, Kreditech determines the 
	probability of default using a social scoring method based on fast data 
	analysis. The company extracts as much data as possible from its users, 
including personal data from eBay and Facebook profiles and other social
	networking sites. It even records how long applicants take to fill out the 
	questionnaire, the frequency of errors and deletions, and what kind of 
	computer they use. The more information it has, the higher a customer's 
	potential credit line.
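
As a rough illustration of social scoring, here is a toy logistic model in
Python. The feature names, weights, and bias are invented for the sketch;
Kreditech's actual model is proprietary and certainly far richer.

```python
# Toy social scoring: many weak behavioral signals combined into one
# probability of default. All weights here are invented.
import math

WEIGHTS = {
    "form_errors": 0.30,         # corrections while filling the form
    "minutes_on_form": -0.05,    # rushing through looks riskier
    "profile_age_years": -0.20,  # an old eBay/Facebook profile reassures
}
BIAS = -1.0

def default_probability(applicant):
    score = BIAS + sum(w * applicant[k] for k, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-score))   # logistic squash to (0, 1)

applicant = {"form_errors": 4, "minutes_on_form": 2, "profile_age_years": 1}
print(f"p(default) = {default_probability(applicant):.2f}")  # ~0.48
```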
 
 Kreditech is expanding rapidly in eastern 
	Europe and plans to launch soon in Russia. But it terminated its service in 
	Germany when the Federal Financial Supervisory Authority (BaFin) proposed to 
	examine its business model. The model generates revenue not only from 
	microcredit deals and interest but also from renting credit scores to other 
	companies. Despite all this, investors find social scoring very attractive.
 
 Business models like Kreditech's illustrate the sensitivity of the 
	issues that Big Data raises. Users give up their data freely, bit by bit, 
	and everyone adds to this huge new data resource every day. But what happens 
	to a stash of credit profiles if its owners are taken over or go bust?
 
 TomTom, a Dutch manufacturer of GPS navigation equipment, sold its data 
	to the Dutch government, which then passed on the data to the police. They 
	used it to set up speed traps in places where they were most likely to 
	generate revenue from speeding TomTom users. TomTom issued a public apology.
 
 Big Data applications are especially valuable when they generate 
	personalized profiles. This may be appealing to retailers and some 
	consumers, but data privacy advocates see many Big Data concepts as Big 
	Brother scenarios of a completely new dimension.
 
 Many companies say 
	the data they gather, store, and analyze remains anonymous. But our mobility 
	patterns alone can be used to identify almost all of us uniquely. The more 
	data is in circulation and available for analysis, the more likely it is 
	that anonymity becomes algorithmically impossible.
 
 Most people don't 
	want companies to store their personal data or to track their online 
	behavior. A proposed European data protection directive includes a "right to 
	be forgotten" on the web. But this may be utopian. We face an impending 
	tyranny of algorithms.
 
	AR I worked in the SAP HANA development team 
	from 2003 to 2009.
 
	Big Data 
	
MIT Technology Review, May 2013
 
		Edited by Andy Ross 
	SAP likes Big Data. SAP is working with young companies to help them take 
	advantage of its revolutionary HANA in-memory Big Data platform. Some of the 
	most adroit users of Big Data are small startups. Fortune 1000 CIOs often 
	say they have a lot of data but haven't yet figured out a way to translate 
	it into real results.
 SAP HANA was the brainchild of Hasso Plattner 
	and Vishal Sikka. The HANA platform takes advantage of a new generation of 
	columnar databases running on multicore processors. The entire system is in 
	RAM, and users say data queries that used to take days now run in seconds.
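
A toy comparison in Python hints at why this layout is fast for analytics:
an aggregate query scans one contiguous column in memory rather than
touching every row record. This is an illustrative sketch with made-up
data, not SAP code.

```python
# Illustrative only: why a columnar in-memory store answers analytic
# queries faster than a row store. Data and schema are made up.
import numpy as np

n = 100_000

# Row store: one record per sale, the classic transactional layout.
rows = [{"region": i % 50, "revenue": float(i % 997)} for i in range(n)]

# Column store: each attribute is a contiguous in-memory array.
region = np.fromiter((i % 50 for i in range(n)), dtype=np.int32, count=n)
revenue = np.fromiter((float(i % 997) for i in range(n)),
                      dtype=np.float64, count=n)

# Query: total revenue in region 7.
total_rowwise = sum(r["revenue"] for r in rows if r["region"] == 7)
total_columnar = revenue[region == 7].sum()   # one vectorized scan

assert np.isclose(total_rowwise, total_columnar)
```

The columnar scan touches only the two columns the query needs; a row scan
drags every other attribute of every record through the CPU cache as well,
which is the cost the in-memory columnar design avoids.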
 
	AR HANA was the brainchild of all of us in the 
	HANA team too.
 
	Google Brains For Big Data 
	
	Wired, May 2013 
		Edited by Andy Ross 
	Stanford professor Andrew Ng joined Google's X Lab to build huge AI systems 
	for working on Big Data.
 He ended up building the world's largest 
	artificial neural network (ANN). Ng's new brain watched YouTube videos for a 
	week and taught itself all about cats. Then it learned to recognize voices 
and interpret Google StreetView images. The work moved from the X Lab to the
Google Knowledge Team. Now "deep learning" could boost Google Glass, Google
	image search, and even basic web search.
 
 Ng invited AI pioneer 
	Geoffrey Hinton to come to Mountain View and tinker with algorithms. Android 
	Jelly Bean included new algorithms for voice recognition and cut the error 
	rate by a quarter. Ng departed and Hinton joined Google, where he plans to 
	take deep learning to the next level.
 
 Hinton thinks ANN models of 
documents could boost web search like they did voice recognition. Google's
Knowledge Graph is a database of nearly 600 million entities; when you
search for something, it pops up information about the entity to the right
of your search results. Hinton says ANNs could study the graph and then
cull the errors and refine new facts for it.
 
 ANN research has boomed as 
	researchers harness the power of graphics processors (GPUs) to build bigger 
	ANNs that can learn fast from Big Data. With unsupervised learning 
	algorithms the machines can learn on their own, but for really big ANNs 
	Google first had to write code that would harness all the machines and still 
run if some nodes failed. It takes a lot of work to train ANN models.
Training the YouTube cat model used 16,000 processor cores. But then it
took just 100 cores to spot cats in videos.
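
For a feel of what learning without labels means, here is a minimal
unsupervised learner in Python: a tiny autoencoder that discovers features
in unlabeled data by learning to reconstruct its own input. It is a toy
stand-in for Google's vastly larger networks; all sizes and data are
placeholders.

```python
# A toy autoencoder: unsupervised learning by reconstruction.
# No labels anywhere; the network invents its own features.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 64))          # 1000 unlabeled "images"

n_hidden = 16                                 # feature detectors to learn
W1 = 0.1 * rng.standard_normal((64, n_hidden))
W2 = 0.1 * rng.standard_normal((n_hidden, 64))
lr = 0.01

for step in range(200):
    H = np.tanh(X @ W1)                       # encode: feature activations
    X_hat = H @ W2                            # decode: reconstruction
    err = X_hat - X
    # Gradient descent on the reconstruction error, backpropagated
    # through both layers.
    grad_W2 = H.T @ err / len(X)
    grad_H = (err @ W2.T) * (1 - H**2)        # tanh derivative
    grad_W1 = X.T @ grad_H / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

# Each column of W1 is now a feature the network found on its own,
# the same principle by which the YouTube network found "cat".
```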
 
 Hinton aims to test a teranode ANN 
	soon.
 
	AR I like the idea of using ANNs for document 
search. It will improve result relevance as much as Google's
probability-based translation improved quality over rule-based translation.
 
	Big Data 
	
Mark P. Mills, City Journal, July 2013
 
		Edited by Andy Ross 
	What makes Big Data useful is software. When the first microprocessor was 
invented in 1971, software was a $1 billion industry. Today it is a $350
billion industry. Big Data analytics will grow software into a
multi-trillion-dollar industry.
 Image data processing lets Facebook 
	track where and when vacationing is trending. Looking at billions of photos 
	over weeks or years and correlating them with related data sets (vacation 
	bookings, air traffic), tangential information (weather, interest rates, 
	unemployment), or orthogonal information (social or political trends), we 
	can associate massive data sets and unveil all manner of facts.
 
 Isaac 
	Asimov called the idea of using massive data sets to predict human behavior 
	psychohistory. The bigger the data set, he said, the more predictable the 
	future. With Big Data analytics, we can see beyond the apparently random 
	motion of a few thousand molecules of air to see the balloon they are 
	inside, and beyond that to the bunch of party balloons on a windy day. The 
	software world has moved from air molecules to weather patterns.
 
 The 
	new era will involve data collected from just about everything. Until now, 
	given the scale and complexities of commerce, industry, society, and life, 
	you couldn't measure everything, so you approximated by statistical sampling 
	and estimation. That era is almost over. Instead of estimating how many cars 
	are on a road, we will count each and every one in real time as well as 
	hundreds of related facts about each car.
 
 Big data sets can reveal 
	trends that tell us what will happen without the need to know why. With 
	robust correlations, you don't need a theory, you just know. Observational 
	data can yield enormously predictive tools. The why of many things that we 
	observe, from entropy to evolution, has eluded physicists and philosophers. 
	Big data may amplify our ability to make sense of nearly everything in the 
	world.
 
 The Big Data revolution is propelled by the convergence of 
	three technology domains: powerful but cheap information engines, ubiquitous 
	wireless broadband, and smart sensors. Nearly a century ago, the air travel 
	revolution was enabled by the convergent maturation of powerful combustion 
	engines, aluminum metallurgy, and the oil industry.
 
 Business surveys 
show $3 trillion in global information and communications technology (ICT)
infrastructure spending planned for the next decade. This puts Big
	Data in the same league as Big Oil, projected to spend $5 trillion over the 
	same decade. All this is bullish for the future of the global economy.
 
	Big Data Analysis 
	
Jennifer Ouellette, Quanta, October 9, 2013
 
		Edited by Andy Ross 
	Since 2005, computing power has grown largely by using multiple cores and 
	multiple levels of memory. The new architecture is no longer a single CPU 
	plus RAM and a hard drive. Supercomputers are giving way to distributed data 
	centers and cloud computing.
 These changes prompt a new approach to 
	big data. Many problems in big data are about managing the movement of data. 
	Increasingly, the data is distributed across multiple computers in a large 
	data center or in the cloud. Big data researchers seek to minimize how much 
	data is moved back and forth from slow memory to fast memory. The new 
	paradigm is to analyze the data in a distributed way, with each node in a 
	network performing a small piece of a computation. The partial solutions are 
	then integrated for the full result.
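
A minimal Python sketch of that paradigm: each worker computes small
partial statistics over its own shard, and only those partials travel to
be combined. Real frameworks such as MapReduce or Spark add scheduling and
fault tolerance on top of the same idea.

```python
# Each "node" analyzes its local shard; only tiny partial results
# travel over the network to be integrated into the full answer.
from concurrent.futures import ProcessPoolExecutor

def partial_stats(shard):
    """Runs on one node: the raw data never leaves the shard."""
    return sum(shard), sum(x * x for x in shard), len(shard)

def mean_and_variance(shards):
    with ProcessPoolExecutor() as pool:
        partials = list(pool.map(partial_stats, shards))
    s, ss, n = (sum(t) for t in zip(*partials))   # integrate partials
    mean = s / n
    return mean, ss / n - mean**2

if __name__ == "__main__":
    # Four shards standing in for four machines in a data center.
    shards = [range(i, i + 1000) for i in range(0, 4000, 1000)]
    print(mean_and_variance(shards))              # (1999.5, 1333333.25)
```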
 
 MIT physicist Seth Lloyd says 
	quantum computing could assist big data by searching huge unsorted data 
	sets. Whereas a classical computer runs with bits (0 or 1), a quantum 
	computer uses qubits that can be 0 and 1 at the same time, in 
	superpositions. Lloyd has developed a conceptual prototype for quantum RAM 
	(Q-RAM) plus a Q-App — "quapp" — targeted to machine learning. He thinks 
	his system could find patterns within data without actually looking at any 
	individual records, to preserve the quantum superposition.
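
For scale, the standard quantum result for searching an unsorted collection
is Grover's algorithm, a quadratic speedup. The article does not say which
algorithm Lloyd's design builds on, so take this only as the canonical
reference point:

```latex
% Grover search: finding one marked item among N unsorted records.
% A classical computer must examine O(N) records; a quantum computer
% needs only O(sqrt(N)) queries, so a billion-record scan drops to
% roughly 31,600 quantum queries.
\[
  T_{\text{classical}} = O(N), \qquad
  T_{\text{quantum}} = O(\sqrt{N}), \qquad
  \sqrt{10^9} \approx 3.16 \times 10^4 .
\]
```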
 
 Caltech 
	physicist Harvey Newman foresees a future for big data that relies on armies 
	of intelligent agents. Each agent records what is happening locally but 
	shares the information widely. Billions of agents would form a vast global 
	distributed intelligent entity.
 
 
	Privacy 
	
Evgeny Morozov, MIT Technology Review, October 22, 2013
 
		Edited by Andy Ross 
	Technology companies and government agencies have a shared interest in the 
	collection and rapid analysis of user data.
 The analyzed data can 
	help solve problems like obesity, climate change, and drunk driving by 
	steering our behavior. Devices can ping us whenever we are about to do 
	something stupid, unhealthy, or unsound. This preventive logic is 
coercive. The technocrats can neutralize politics by replacing the messy
stuff with data-driven administration.
 
 Privacy is not an end in 
	itself but a means of realizing an ideal of democratic politics where 
	citizens are trusted to be more than just suppliers of information to 
	technocrats. In the future we are sleepwalking into, everything seems to 
	work but no one knows exactly why or how. Too little privacy can endanger 
	democracy, but so can too much privacy.
 
 Democracies risk falling 
	victim to a legal regime of rights that allow citizens to pursue their own 
	private interests without any reference to the public. When citizens demand 
	their rights but are unaware of their responsibilities, the political 
	questions that have defined democratic life over centuries are subsumed into 
	legal, economic, or administrative domains. A democracy without engaged 
	citizens might not survive.
 
 The balance between privacy and 
	transparency needs adjustment in times of rapid technological change. The 
	balance is a political issue, not to be settled by a combination of 
	theories, markets, and technologies. Computerization increasingly appears as 
	a means to adapt an individual to a predetermined, standardized behavior 
	that aims at maximum compliance with the model patient, consumer, taxpayer, 
	employee, or citizen.
 
 Big data constrains how we mature politically 
	and socially. The invisible barbed wire of big data limits our lives to a 
	comfort zone that we did not choose and that we cannot rebuild or expand. 
	The more information we reveal about ourselves, the denser but more 
	invisible this barbed wire becomes. We gradually lose our understanding of 
	why things happen to us. But we can cut through the barbed wire. Privacy is 
	the resource that allows us to do that.
 
 Think of privacy in economic 
	terms. By turning our data into a marketable asset, we can control who has 
	access to it and we can make money. To ensure a good return on my data 
	portfolio, I need to ensure that my data is not already available elsewhere. 
	But my decision to sell my data will impact other people. People who hide 
their data will be considered deviants with something to hide. Data sharing
should not be delegated to an electronic agent unless we want to cleanse our
lives of their political dimension.
 
 Reducing the privacy problem to the 
	legal dimension is worthless if the democratic regime needed to implement 
	our answer unravels. We must link the future of privacy with the future of 
	democracy:
 
1. We must politicize the debate about privacy and information sharing.

2. We must learn how to sabotage the system with information boycotts.

3. We need provocative digital services to reawaken our imaginations.
 
 The digital right to privacy is secondary. 
	The fate of democracy is primary.
 
	  
		