Financial Big Data Analysis Algorithms & Architecture
The growing volume of big data, such as user-generated or social interaction data, has led to a “Big Data Revolution”. YUKKA Lab AG develops solutions that enable the analysis of finance-related big data in real time. However, this raises several challenges: What data should lead to what conclusions? What is the meaning of a particular kind of data? What kind of recommendations can be derived? How should big data be stored so that different kinds of analyses can be carried out efficiently? How should data be reanalyzed once the algorithms change?
In order to answer these questions, novel methods have to be found that fall into the following topic areas:
- Developing new algorithms in order to analyze financial big data using rule-based approaches, graph processing and machine learning based on the Apache Spark Framework.
- Developing new architectures to store data so that it can be used efficiently by algorithms. This includes reducing access times, e.g. by defining structures that enable efficient retrieval of the important information.
- Algorithms, as well as the data structures they need, may change. At the same time, they may need to be reapplied to existing data as well as to new data. Another challenge is therefore to manage the lifecycle of algorithms and their application to data.
Students can:
- choose their thesis topic from one of the above topic areas,
- decide whether they want to work more on the conceptual/scientific or the implementation level (usually it will be a mixture of both).
On YUKKA Lab AG’s side, students have access to financial experts, real-world financial big data and the existing analysis infrastructure. On TUB’s side, our 200-node cluster (each node: quad-core Xeon @ 3.3 GHz, 16 GB RAM, 3 TB RAID0) is available for testing and evaluation purposes. Thesis language: German or English.
Prerequisites: good programming skills in Java, general interest in distributed systems, data analysis and optionally finance. The topic areas can be adjusted to better match the student's individual interests and skill level. If desired, topics may be shared among multiple students. Students will be supervised by experts from YUKKA Lab AG and scientists from TUB/CIT.