TU Berlin

Big Data Visualization and Integration of Analytics Components (Topic Area/Multiple Topics)

The rising amount of big data, such as user-generated or social interaction data, has led to a "Big Data Revolution". One of the main challenges in this context is the visualization of Big Data. Naturally, Big Data is too big to fit on a single screen, and even if it did fit, a user could not make sense of such a large amount of data displayed all at once. This leads to follow-up questions: What subset of the data is relevant to the user? How should it be chosen? What options are needed to browse and display the subset? How can this be realized technically in a scalable and user-oriented manner? How can analysis results be presented in an intuitive and insightful way?

The aim of the thesis topics in this area is to help answer these questions by extending Graphr Visualizer, a visual big data analysis tool developed at CIT, and integrating it with existing analytics components. Graphr Visualizer is based on an MVC architecture in which the view resides on the client (main technology: d3js.org), while the model and the controller reside on the server (main technologies: graphr/Java/web services). Graphr Visualizer is designed to be powerful and scalable, but also highly interactive and user-friendly.

Thesis topics are defined based on the following feature groups:

  • Integrating and visualizing the results of existing feature extraction approaches, such as http://cs.stanford.edu/people/karpathy/deepimagesent/
  • Finding and visualizing structural patterns in media, such as social networks, locations or cliques
  • Enhancing the usability of Graphr Visualizer
  • Modifying the force-based layout to support a more natural visualization of relevant patterns, such as paths and clusters; providing visual means to select and store such patterns
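
To give a feel for the last feature group, here is a minimal sketch of a force-based layout with one added force that pulls the members of a known cluster toward their common centroid. It is not the Graphr Visualizer implementation and not the d3 force layout; the function name force_layout, its parameters and their default values, and the small example graph are all assumptions made for illustration.

```python
import math


def force_layout(nodes, edges, clusters=None, iterations=300,
                 repulsion=0.05, spring=0.1, cluster_pull=0.05, step=0.1):
    """Relax node positions with a basic force model; returns {node: (x, y)}.

    All names and default values are hypothetical, chosen for illustration.
    """
    clusters = clusters or {}
    n = len(nodes)
    # Deterministic start: spread the nodes evenly on the unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}

    # Group nodes by cluster label once; membership does not change.
    groups = {}
    for v in nodes:
        if v in clusters:
            groups.setdefault(clusters[v], []).append(v)

    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}

        # 1) Every pair of nodes repels with magnitude repulsion / distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d2 = dx * dx + dy * dy + 1e-9  # guard against division by zero
                force[a][0] += repulsion * dx / d2
                force[a][1] += repulsion * dy / d2
                force[b][0] -= repulsion * dx / d2
                force[b][1] -= repulsion * dy / d2

        # 2) Every edge acts as a spring that pulls its endpoints together.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring * dx
            force[a][1] += spring * dy
            force[b][0] -= spring * dx
            force[b][1] -= spring * dy

        # 3) Added force: cluster members are pulled toward their centroid.
        for members in groups.values():
            cx = sum(pos[v][0] for v in members) / len(members)
            cy = sum(pos[v][1] for v in members) / len(members)
            for v in members:
                force[v][0] += cluster_pull * (cx - pos[v][0])
                force[v][1] += cluster_pull * (cy - pos[v][1])

        # Move each node a small step along its net force.
        for v in nodes:
            pos[v][0] += step * force[v][0]
            pos[v][1] += step * force[v][1]

    return {v: (p[0], p[1]) for v, p in pos.items()}


# Example graph: two triangles joined by a single bridge edge c-d.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
layout = force_layout(nodes, edges, clusters)
```

In this sketch the cluster labels are given as input. A thesis in this feature group would more likely have to derive such patterns from the data and decide how strongly they should shape the layout. The pairwise repulsion loop is quadratic in the number of nodes, so it only suits small graphs.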

Students can…

  • choose their thesis topic from one of the above feature groups,
  • decide whether they want to work more on the conceptual/scientific or implementation level (usually it will be a mixture of both). 

For testing and evaluation purposes, real-world big data, corresponding use cases and analysis results, as well as our 200-node cluster (each node: quad-core Xeon @ 3.3 GHz, 16 GB RAM, 3 TB RAID0), are available to students. Thesis language: German or English. Prerequisites: good programming skills in Java or JavaScript, and a general interest in distributed systems, data analysis and visualization. The feature groups can be adjusted to better match individual interests and skill levels. If desired, topics may be shared among multiple students.

Contact

Dr. Peter Janacik
+49 (30) 314-25397
Room E-N 103