Central Component for #datadrivenarchitecture: Real-time Insights powered by Reactive Programming #netflix

Can help with:

  • testing
  • debugging
  • security

Solution:

  • log everything

Problems:

  • so much data
  • so many devices
  • not feasible to save everything to Elasticsearch first and analyze it later (the insights are needed in real time!)

Stream analysis with reactive programming

In a data-driven architecture, the processors attached to the high-performance message bus benefit from being written in Rx: they react to each event as it arrives instead of querying stored data afterwards.
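To make the idea more concrete, here is a minimal sketch of push-based stream processing in plain Python. It is not RxJS and not the Mantis API; the `Stream` class, its `filter`/`map`/`subscribe` methods and the sample events are hypothetical and only imitate the operator style of Rx libraries.

```python
class Stream:
    """Tiny push-based stream; every operator returns a new Stream."""

    def __init__(self, subscribe_fn):
        # subscribe_fn(on_next) pushes each event into the on_next callback.
        self._subscribe_fn = subscribe_fn

    @classmethod
    def from_iterable(cls, events):
        def subscribe(on_next):
            for event in events:
                on_next(event)
        return cls(subscribe)

    def filter(self, predicate):
        def subscribe(on_next):
            def forward(event):
                # Only pass on events that satisfy the predicate.
                if predicate(event):
                    on_next(event)
            self._subscribe_fn(forward)
        return Stream(subscribe)

    def map(self, fn):
        def subscribe(on_next):
            # Transform every event before passing it on.
            self._subscribe_fn(lambda event: on_next(fn(event)))
        return Stream(subscribe)

    def subscribe(self, on_next):
        self._subscribe_fn(on_next)


# Hypothetical log events as they might arrive from devices.
events = [
    {"device": "tv-1", "level": "INFO", "msg": "playback started"},
    {"device": "phone-7", "level": "ERROR", "msg": "playback failed"},
    {"device": "tv-1", "level": "ERROR", "msg": "license expired"},
]

errors = []
(Stream.from_iterable(events)
    .filter(lambda e: e["level"] == "ERROR")
    .map(lambda e: f'{e["device"]}: {e["msg"]}')
    .subscribe(errors.append))

print(errors)
# ['phone-7: playback failed', 'tv-1: license expired']
```

Nothing is stored before it is analyzed: each event flows through the filter and the mapping as soon as the source emits it.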

Who uses it?

Users of RxJS: Netflix, Google, Facebook and more

Existing Solutions:

Differences

  • Mantis is designed for operational use cases where message guarantee levels vary by job. So, some jobs can choose at-most-once guarantees while others choose at-least-once guarantees via Kafka. We are able to saturate the NIC on the servers for operational use cases with very little CPU usage.
  • Built-in back pressure that allows Mantis to seamlessly switch between push, pull or mixed modes based on the type of data sources
  • Support for a mix of long-running perpetual analysis jobs along with user-triggered short-lived queries in a common cluster
  • Since the volume of data to be processed at Netflix varies tremendously by time of day, being able to autoscale workers in a job based on resource consumption and the ability to scale the cluster as a whole were key requirements. None of the existing streaming frameworks provided such support.
  • We wanted more control over how we schedule the resources so we can do smarter allocations like bin packing etc. (that also allows us to scale the jobs)
  • Deep integration with the Netflix ecosystem that allows filtering the event stream at the source of data.
    among others
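The "bin packing" mentioned in the scheduling point can be illustrated with a short sketch. This is the textbook first-fit-decreasing heuristic, not the scheduler Mantis actually uses; the function name, the job sizes and the server capacity are hypothetical.

```python
def first_fit_decreasing(job_sizes, server_capacity):
    """Pack jobs onto servers with the first-fit-decreasing heuristic."""
    servers = []  # each server is represented by the list of job sizes on it
    for size in sorted(job_sizes, reverse=True):
        if size > server_capacity:
            raise ValueError(f"job of size {size} exceeds server capacity")
        for server in servers:
            # Place the job on the first server that still has room.
            if sum(server) + size <= server_capacity:
                server.append(size)
                break
        else:
            # No existing server fits, so start a new one.
            servers.append([size])
    return servers


# Hypothetical CPU demands of six workers, packed onto 8-core servers.
placement = first_fit_decreasing([4, 3, 2, 5, 1, 1], server_capacity=8)
print(placement)
# [[5, 3], [4, 2, 1, 1]]
```

Packing workers tightly like this leaves whole servers free, which is what makes it possible to scale the cluster down when load drops.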

Sources:

Soccer Analytics – Predicting the value of a forward: the critical goal score

In the world of soccer, forwards are the players who generate the highest transfer fees, especially forwards who score a large number of goals. But let’s remember the 2012/2013 season, when Lionel Messi broke Gerd Müller’s scoring record and FC Bayern Munich had no player scoring more than 20 goals. Still, Munich broke every record there was and won every title, as well as beating Lionel Messi’s FC Barcelona 7:0 on aggregate over two games.
So total goals is a very bad benchmark for the value of a forward. Still, the job of a forward is to score; that’s why I developed the critical goal score, which takes into account importance and difficulty.

Importance: To identify the important, game-deciding goals, I identify all goals that are needed for the team to win or tie. The intuition behind this is that goals 2 to 6 in a 6:0 are much less important than goals 1 and 2 in a 2:1. We then identify the points contributed by multiplying the resulting gain in points {1,3} by the number of critical goals this player scored, divided by the number of all critical goals needed to gain the points.
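The importance rule can be sketched in a few lines of Python. The function name and the sample scorer lists are hypothetical, and one assumption goes beyond the text: the first goals of the team in chronological order are treated as the critical ones (a win needs one goal more than the opponent scored, a tie needs as many).

```python
def point_contribution(scorers, goals_against):
    """Points credited to each scorer of one team for a single match.

    scorers: the team's goal scorers in chronological order.
    goals_against: number of goals the team conceded in that match.
    """
    goals_for = len(scorers)
    if goals_for > goals_against:
        points, needed = 3, goals_against + 1  # win
    elif goals_for == goals_against:
        points, needed = 1, goals_for          # tie
    else:
        return {}                              # loss: no points to share
    if needed == 0:
        return {}                              # 0:0, no goal earned the point

    # Assumption: the first `needed` goals are the critical ones.
    contribution = {}
    for player in scorers[:needed]:
        contribution[player] = contribution.get(player, 0) + points / needed
    return contribution


# 6:0 win: only the first goal was needed to win.
print(point_contribution(["A", "B", "A", "A", "C", "B"], goals_against=0))
# {'A': 3.0}

# 2:1 win: both goals were needed, so the three points are split.
print(point_contribution(["A", "B"], goals_against=1))
# {'A': 1.5, 'B': 1.5}
```

Summing these per-match contributions over a season gives the points a player contributed through critical goals.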

Difficulty: The other dimension is how difficult it is to score a goal against the opponent. A goal that resulted in winning against FC Bayern Munich should count more than a goal that helps you win against Werder Bremen, because the defence of FC Bayern is much harder to break down than the defence of Werder Bremen. As we’re using data from the past, we can simply use the goals scored against this team divided by the sum of all goals scored against every team at the end of the year; the smaller this share, the harder the defence is to score against.
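As a rough sketch of the difficulty measure, the share of goals conceded by each opponent can be computed as follows. The team names and numbers are made up for illustration, and how exactly a share is turned into a weight for a goal is left open here, since the text does not spell it out.

```python
def conceded_share(goals_conceded):
    """Share of all conceded goals per team, from end-of-season totals."""
    total = sum(goals_conceded.values())
    if total == 0:
        raise ValueError("no goals conceded in the data")
    return {team: goals / total for team, goals in goals_conceded.items()}


# Hypothetical end-of-season totals of goals conceded per team.
shares = conceded_share({"Team A": 18, "Team B": 66, "Team C": 116})
for team, share in sorted(shares.items(), key=lambda item: item[1]):
    print(team, round(share, 2))
```

A team with a small share, like the hypothetical Team A, conceded few goals, so a goal against it was harder to score than one against Team C.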

This value is better than the overall goals scored, but it still looks into the past. To predict the future value of a player, you should use the development of this score and its correlation with other “internal values” such as fitness test scores, weight, speed and psychological profiles. Then you can predict live how this score will continue to develop. This is necessary because the past is not a good indicator of the future!

Finally, this number can be used to identify the real value of buying a player. To do so, you multiply the average points gained by this player by the revenue these points generate from the league and TV, and add his merchandising/marketing value. Then we use the NPV and real option approach for valuing the opportunity to buy a player out of his contract.
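The NPV part of this valuation might look like the sketch below. It reads the text as "points times revenue per point, plus marketing value" for each season, which is one possible interpretation. The function name, parameters and numbers are hypothetical, and the real option part is not covered.

```python
def player_npv(points_per_season, revenue_per_point, marketing_value,
               transfer_fee, discount_rate, seasons):
    """Net present value of buying a player, under simple assumptions.

    Assumes the same cash flow at the end of every season and a transfer
    fee paid up front.
    """
    yearly_cash_flow = points_per_season * revenue_per_point + marketing_value
    present_value = sum(
        yearly_cash_flow / (1 + discount_rate) ** season
        for season in range(1, seasons + 1)
    )
    return present_value - transfer_fee


# Hypothetical numbers in millions: 10 points per season worth 1.0 each,
# 5.0 marketing value per season, a fee of 40.0, 8% discount rate, 4 seasons.
npv = player_npv(10, 1.0, 5.0, transfer_fee=40.0, discount_rate=0.08, seasons=4)
print(round(npv, 2))
```

A positive result would suggest the player is worth more than the fee under these assumptions, a negative one that he is not.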

Date Converter: Transform a date into different formats

One very common data transformation is the standardization of dates. This Date Converter code can be customized to your needs. You can use foreign languages or change the order of day, month and year. You could also include other standardizations, such as choosing between ‘01’ and ‘1’.
[sourcecode language="python"]
# Date Converter
# Write a procedure date_converter which takes two inputs. The first is
# a dictionary and the second a string. The string is a valid date in
# the format month/day/year. The procedure should return
# the date written in the form <day> <name of month> <year>.
# For example, if the
# dictionary is in English,
english = {1: "January", 2: "February", 3: "March", 4: "April", 5: "May",
           6: "June", 7: "July", 8: "August", 9: "September", 10: "October",
           11: "November", 12: "December"}
# then "5/11/2012" should be converted to "11 May 2012".
# If the dictionary is in Swedish
swedish = {1: "januari", 2: "februari", 3: "mars", 4: "april", 5: "maj",
           6: "juni", 7: "juli", 8: "augusti", 9: "september", 10: "oktober",
           11: "november", 12: "december"}
# then "5/11/2012" should be converted to "11 maj 2012".
# Hint: int('12') converts the string '12' to the integer 12.
def date_converter(dic, string):
    # Locate the two '/' separators of the month/day/year string.
    first_split = string.find('/')
    month = string[0:first_split]
    second_split = string.find('/', first_split + 1)
    day = string[first_split + 1:second_split]
    year = string[second_split + 1:]

    # Look up the month name by its number in the given dictionary.
    month_name = dic[int(month)]

    return day + ' ' + month_name + ' ' + year

print(date_converter(english, '5/11/2012'))
#>>> 11 May 2012
print(date_converter(english, '5/11/12'))
#>>> 11 May 12
print(date_converter(swedish, '5/11/2012'))
#>>> 11 maj 2012
print(date_converter(swedish, '12/5/1791'))
#>>> 5 december 1791
[/sourcecode]
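The customizations mentioned above (a different order of day, month and year, and ‘01’ versus ‘1’) could look like the following sketch. The function `format_date` and its parameters `order` and `pad_day` are hypothetical additions, and the dictionary is shortened to two months to keep the example small.

```python
def format_date(month_names, date_string, order=("day", "month", "year"),
                pad_day=False):
    """Convert month/day/year into a configurable written form."""
    month, day, year = date_string.split("/")
    parts = {
        # Either pad the day to two digits ('01') or strip the padding ('1').
        "day": day.zfill(2) if pad_day else str(int(day)),
        "month": month_names[int(month)],
        "year": year,
    }
    return " ".join(parts[name] for name in order)


# Shortened month dictionary, for illustration only.
english = {5: "May", 12: "December"}

print(format_date(english, "5/1/2012", pad_day=True))
# 01 May 2012
print(format_date(english, "5/01/2012"))
# 1 May 2012
print(format_date(english, "12/5/1791", order=("month", "day", "year")))
# December 5 1791
```

Keeping the parts in a dictionary means a new output order only needs a different `order` tuple, not a new function.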