Big Data News December 12, 2012

  • Santa Claus & The Data-Driven Christmas

    During a recent deep dive into some holiday marketing and retail research, it occurred to me that Santa’s job at the North Pole must generate a substantial amount of data. The question I came to was this: did he even know it? If so, was he using the data to improve distribution procedures and global delivery operations on Christmas? Read this Q&A interview to learn about the Jolly Old Elf’s big data operations.

  • Big Data Comes of Age

    Everyone is talking about big data, but here is a research report that gives facts on what is really going on with big data. EMA Research & 9Sight Consulting conducted a study of business and IT professionals, inquiring about their big data plans and implementations. The report details the obstacles, strategies and approaches identified by the respondents.

  • Seek the grail up the Knowledge Pyramid, not down

    In following the big data ‘buzz’ and trends, it appears that there is a disconnect between our analytical goals (i.e., the types of questions our customers are trying to answer) and the computational substrate on which we build in order to answer them.

    NoSQL technologies, while being far more…

  • Data Scientist: Sexy Is as Sexy Does

    What’s sexy about data science? It has been dubbed the “sexiest occupation” of the 21st century, but you don’t see hordes of autograph-seekers and paparazzi flitting around many data scientists. James Kobielus looks at why data science is hot.

  • The Big Data Analytics Landscape

    As a technologist and evangelist, I find working in the big data marketplace exciting. I am excited by the new products we are bringing to market and by how this new functionality helps bridge the gap for enterprise adoption. It is also surreal, given the number of blog posts and tweets on big data, and there seems to be a new big data conference cropping up on a weekly basis across Europe 🙂



    It is interesting to monitor other vendors in the…

  • Going Beyond the Buzz on Big Data

    Everyone is talking about big data, but if you want to see some REAL data on what is going on with big data, here is a new research report you will appreciate. EMA Research & 9Sight Consulting conducted a study of business and IT professionals, inquiring about their big data plans and implementations to learn what is really going on with big data. The report details the obstacles, strategies and approaches identified by the respondents.

  • Can Big Data Analytics learn some lessons from fraud detection software?

    Credit card fraud detection analytics has worked in banking, and its history yields lessons that should be applied to big data.

  • 8 TESTS TO DECODE BUSINESS ACUMEN OF A DATA SCIENTIST

    A data scientist at Flutura has to wear multiple hats in order to deliver next-generation analytical solutions in the sectors we operate in, namely the energy, telecom, digital and health care industries. To do that, he or she has to wear three hats:

    – The BUSINESS hat

    – The MATH hat

    – The DATA hat

    Most of the time it’s easy to fathom the depth of the data scientist’s math and algorithmic knowledge and the depth of…

Prediki all-purpose prediction

An interesting startup in the big data prediction field.

http://gigaom.com/europe/predikis-all-purpose-prediction-promises-attract-austrian-government-cash/

Accurate predictions have been a tantalizing prize since the days of the soothsayers, but these days the business is getting more technical, from collaboration-centric financial forecasting techniques to Nate Silver’s data-driven political predictions. Of course, those two examples are pretty different, but an Austrian firm called Prediki is now trying to come up with a tool for general-purpose predictions.

It’s still all relatively stealthy – that link above won’t tell you much – but the company has nonetheless announced $650k in seed financing from the Austrian government-funded Federal Promotion Bank.

In a statement, CEO Hubertus Hofkirchner said Prediki’s patent-pending technology would be able to “unveil information about the future where traditional market research and opinion survey instruments have proven unreliable or inapplicable”, for clients ranging from companies to governments.

Hofkirchner has form in this business. He was formerly CEO of a company called Redmonitor, which dealt in financial predictions and sold out to CMC Markets, a UK-based derivatives dealer. However, Prediki’s technology has also evolved out of systems that have been used for political polling.

“In the past, the base technology upon which we’re building was mostly used for political forecasts,” Hofkirchner told me. “In recent times it’s probably twice as good as opinion polls for a fraction of the cost. But it was very hard in the past to apply the technology to anything else.

“There has been lots of work done to apply the technology to things like sales forecasts, pharmaceutical approval forecasts, technology adoption, evaluating innovations, evaluating media campaigns and so on – lots of work has been done by various players in the last 10 years. But, while there has been success in the predictive performance of these things, nobody ever cracked the problem that it’s really hard to come up with a model for a truly generic all-purpose prediction market.”

So does Prediki’s technology work? What am I, a fortune teller? But the Austrian government seems to have some faith in it (after all, it can apparently be used for NGOs and even governments), so let’s see. The big unveiling should take place sometime in the first quarter of next year.

Big Data News December 7, 2012

  • Podcast Preview of Big Data Analytics Report

    How are organizations approaching big data? What challenges are they experiencing? What are the commonalities in big data projects across industries and geographies? These questions and more are answered in a podcast with a lead researcher on the report “Analytics: The real-world use of big data.”

  • Addressing the Big Data Skills Gap

    Closing the big data talent gap requires tackling the problem from both sides: the people and the technology. Adequately training the data scientists of tomorrow is an obvious and necessary step, but what about the non-data scientists? And what about the technology side? What can we do to make the technology more accessible to the people? If companies are saying that they don’t have the in-house skills to do something with big data, then doesn’t that imply that the existing big data technologies are just too complicated?

  • Beijing Spirit Leads Enterprises to Continuous Progress

    A country needs great national spirit and so does a city. That is why Beijing, the capital of China, announced the “Beijing Spirit” on November 2, 2011. Beijing Spirit includes Patriotism, Innovation, Inclusiveness and Virtue. It summarizes the spiritual wealth formed in the development and practice of Beijingers, and it has guided Beijing citizens’ practice since then. As an advanced local enterprise in Beijing, Raqsoft integrates Beijing Spirit into its…

  • Get Rid of Mistaken Thoughts in OLAP

    OLAP is a type of BI software that emerged 20 years ago and has gradually developed since. OLAP can be used to handle complex computations flexibly and rapidly according to the requirements of analysts and to present the results to decision-makers in an intuitive and understandable style. The decision-makers can thus grasp the enterprise operating status accurately, understand the object requirements, and set the right scheme.



    The original intention of OLAP is the arbitrary…

Date Converter: Transform a date into different formats

One very common data transformation is the standardization of dates. This date converter code can be customized to your needs. You can use foreign languages or change the order of day, month and year. You could also include other standardizations, such as choosing between ‘01’ and ‘1’.
[sourcecode language="python"]
# Date Converter
# Write a procedure date_converter which takes two inputs. The first is
# a dictionary and the second a string. The string is a valid date in
# the format month/day/year. The procedure should return
# the date written in the form <day> <name of month> <year>.
# For example, if the
# dictionary is in English,
english = {1: "January", 2: "February", 3: "March", 4: "April", 5: "May",
           6: "June", 7: "July", 8: "August", 9: "September", 10: "October",
           11: "November", 12: "December"}
# then "5/11/2012" should be converted to "11 May 2012".
# If the dictionary is in Swedish
swedish = {1: "januari", 2: "februari", 3: "mars", 4: "april", 5: "maj",
           6: "juni", 7: "juli", 8: "augusti", 9: "september", 10: "oktober",
           11: "november", 12: "december"}
# then "5/11/2012" should be converted to "11 maj 2012".
# Hint: int('12') converts the string '12' to the integer 12.
def date_converter(dic, string):
    # Locate the two '/' separators and slice out month, day and year.
    first_split = string.find('/')
    month = string[0:first_split]
    second_split = string.find('/', first_split + 1)
    day = string[first_split + 1:second_split]
    year = string[second_split + 1:]

    # Look up the month name by its number.
    month_name = dic[int(month)]

    return day + ' ' + month_name + ' ' + year

print(date_converter(english, '5/11/2012'))
#>>> 11 May 2012
print(date_converter(english, '5/11/12'))
#>>> 11 May 12
print(date_converter(swedish, '5/11/2012'))
#>>> 11 maj 2012
print(date_converter(swedish, '12/5/1791'))
#>>> 5 december 1791
[/sourcecode]
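The customizations mentioned above (changing the order of day, month and year, and choosing between ‘01’ and ‘1’) could look roughly like this. This is only a sketch: the function name `format_date` and its `order` and `pad` parameters are made up for illustration and are not part of the original exercise.

```python
# Hypothetical variant of the date converter; format_date, order and pad
# are illustrative names, not part of the original code.
english = {1: "January", 2: "February", 3: "March", 4: "April", 5: "May",
           6: "June", 7: "July", 8: "August", 9: "September", 10: "October",
           11: "November", 12: "December"}


def format_date(dic, string, order=("day", "month", "year"), pad=False):
    """Convert a 'month/day/year' string into the requested field order."""
    month, day, year = string.split("/")
    parts = {
        # pad=True writes '01'; pad=False strips leading zeros and writes '1'.
        "day": day.zfill(2) if pad else str(int(day)),
        "month": dic[int(month)],
        "year": year,
    }
    return " ".join(parts[name] for name in order)


print(format_date(english, "5/11/2012"))
print(format_date(english, "5/1/2012", pad=True))
print(format_date(english, "5/11/2012", order=("month", "day", "year")))
```

Using `split("/")` instead of two `find` calls is a design choice for brevity; it assumes, like the original, that the input is a valid date with exactly two slashes.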

Reducing text to its components

This short Python program takes a web page as input and reduces it to its components, the words on the page. You can customize it to fit your purpose. The code can be applied in web crawlers, text analytics and other fields. For example, if you want to leave out stop words, you would define a dictionary of these words and check against it with another if statement. This is useful if you want to reduce patent data to its components and leave out generic terms like ‘a’, ‘this’ or ‘innovation’, because such words carry no information value.

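[sourcecode language="python"]
def remove_tags(source):
    output = []
    # True while we are between words, so the next character starts a new word.
    atsplit = True
    splitlist = [' ', '>', '<', '\n']
    i = 0
    while i < len(source):
        if source[i] == '<':
            # Jump to the '>' that closes this tag.
            i = source.find('>', i + 1)
            if i == -1:
                # Unclosed tag: there is nothing more to read.
                break
        if source[i] in splitlist:
            atsplit = True
        else:
            if atsplit:
                # Start a new word.
                output.append(source[i])
                atsplit = False
            else:
                # Extend the current word.
                output[-1] = output[-1] + source[i]
        i = i + 1
    return output
[/sourcecode]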
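The stop word idea described above can be sketched as a separate filtering step applied to a word list such as the one `remove_tags` returns. The name `remove_stop_words` and the contents of `STOP_WORDS` are assumptions for illustration; the stop word list only holds the three example terms from the text and would need to be extended for real data.

```python
# Illustrative stop word list containing only the example terms named in the
# text; a real list would be longer. STOP_WORDS and remove_stop_words are
# hypothetical names.
STOP_WORDS = {"a", "this", "innovation"}


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Return the words that are not stop words, compared case-insensitively."""
    return [word for word in words if word.lower() not in stop_words]


words = ["This", "patent", "describes", "a", "valve", "innovation"]
print(remove_stop_words(words))
```

A set is used here rather than a dictionary because only membership matters; the same check could instead be placed as an extra `if` inside the loop that builds the word list.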

 
