Phase 3 Final Presentation | Fall 2017

Phase 3 Final

Project Github: https://github.com/parvezk/datavis-phase3

Multi-Foci Layout Visualization of the Multimer Dataset:

(i) Nodes scattered in cluster groups and (ii) nodes assembled as a whole
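The two views in the caption can be sketched without any library as a minimal multi-foci step: each node moves a fraction of the way toward a target point on every tick. This is only an illustration of the idea, not the project's actual code. The group names, focus coordinates, node data, and the function names `step` and `settle` are all hypothetical.

```javascript
// Minimal multi-foci sketch: each node drifts toward the focus of its group.
// The group names, focus coordinates and node data below are hypothetical.
const foci = {
  attention: { x: 100, y: 100 },
  relaxation: { x: 300, y: 100 },
  heartRate: { x: 200, y: 300 },
};

// One simulation step: move every node a fraction `alpha` of the way
// toward the target returned by `targetOf`.
function step(nodes, targetOf, alpha) {
  return nodes.map((node) => {
    const target = targetOf(node);
    return {
      ...node,
      x: node.x + (target.x - node.x) * alpha,
      y: node.y + (target.y - node.y) * alpha,
    };
  });
}

// Repeat the step a fixed number of times.
function settle(nodes, targetOf, iterations = 200, alpha = 0.1) {
  let current = nodes;
  for (let i = 0; i < iterations; i += 1) {
    current = step(current, targetOf, alpha);
  }
  return current;
}

const nodes = [
  { id: 1, group: "attention", x: 0, y: 0 },
  { id: 2, group: "relaxation", x: 0, y: 0 },
  { id: 3, group: "heartRate", x: 0, y: 0 },
];

// (i) Clustered view: each node is pulled to its own group's focus.
const clustered = settle(nodes, (node) => foci[node.group]);

// (ii) Assembled view: every node is pulled to one shared focus.
const center = { x: 200, y: 200 };
const assembled = settle(nodes, () => center);
```

A real D3 force layout would also add collision or charge forces so that nodes in the same cluster do not sit on top of each other; this sketch leaves that out.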

 

Measuring and interpreting physiological metrics through sentiments:

Open-mindedness: Associated with an individual being attracted to a topic, but not alarmed.

Fascination: Associated with a relaxed interest in a topic.

Stimulation: Associated with an individual being more attentive than they are relaxed.

Power / Intensity: Associated with the lasting impact of the experience.

Viacom Visit and Project Update

Geospatial Mapping with D3

For the field trip presentation, I explored recent advances in geospatial mapping and the concepts behind geospatial data and cartography, a field that has been around for a long time.

D3 is a powerful visualization library with many uses. It can do much more than manipulate the DOM or draw charts: D3.js is also very capable when it comes to handling geographic information and building useful, informative web maps.
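At the core of any web map is a projection that turns longitude and latitude into pixel coordinates. The sketch below hand-rolls the simplest one, an equirectangular projection, purely to show the idea. The function name `projectEquirectangular` and the canvas size are assumptions for illustration; in a real project D3's own projections from the d3-geo module would do this work.

```javascript
// Equirectangular projection: longitude and latitude map linearly to x and y.
// Longitude runs from -180 to 180, latitude from -90 to 90 (degrees).
function projectEquirectangular(longitude, latitude, width, height) {
  const x = ((longitude + 180) / 360) * width;
  // Latitude is flipped because screen y grows downward.
  const y = ((90 - latitude) / 180) * height;
  return [x, y];
}

// Hypothetical canvas size for the example.
const width = 960;
const height = 480;

// The point at longitude 0, latitude 0 lands in the middle of the canvas.
const origin = projectEquirectangular(0, 0, width, height);

// The top-left corner of the map is longitude -180, latitude 90.
const topLeft = projectEquirectangular(-180, 90, width, height);
```

Other projections, such as Mercator, differ only in the formula used for `y`, which is why a library can swap them freely behind the same interface.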

I also analyzed some of the latest tools and frameworks available for visualizing geospatial data, such as D3 maps and Uber's deck.gl.

Phase 3 Process Update 1

I spent the last two weeks in the planning and discovery phase of the Phase 3 project. It started with identifying the best visualization form to convey the findings, based on the data collected, of the AP-Multimer research study done by the community partner.

I did the initial analysis of the three datasets received from the community partner (EEG, heart rate, and motion). These datasets had been cleaned, parsed, and analyzed by the Multimer team using Python analysis libraries. Initially there was a copious amount of data in large files, so the files were broken down into smaller chunks and optimized.

I also worked on setting up the development workflow for the final project. I will be using Webpack, npm scripts, and a Node server to bundle the application code and related assets.

Based on the technology stack selected, D3 will be the standard visualization library. I am planning to integrate it with the React ecosystem for several reasons. I picked this approach because React provides a component framework that lets you build self-contained elements (like div or svg:rect) that have custom rendering methods, properties, state, and lifecycle methods.

My approach to the visualization style is a data dashboard that gives users multiple perspectives on the data, as well as the ability to filter between categories and see individual data points. The major component will be a force-directed graph that shows the correlation between different sentiments for each story type. I am currently exploring the features and limitations of force-directed graphs and the steps required to transform the final project data into a form suitable for this type of visualization. From an interaction standpoint, the dashboard will include a category menu and form filters so that users can compare visualizations for different story videos. It will also include tooltip overlays that provide additional information about the survey results.
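One possible way to turn sentiment data into the links of a force-directed graph is to compute a correlation coefficient for every pair of sentiments. The sketch below uses the Pearson correlation for this, which is an assumption on my part rather than a decision already made for the project. The series values and the function names `pearson` and `buildLinks` are hypothetical.

```javascript
// Pearson correlation between two equally long numeric series.
function pearson(xs, ys) {
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
  const meanX = mean(xs);
  const meanY = mean(ys);
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < xs.length; i += 1) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Build one link per pair of sentiments, carrying the correlation value.
function buildLinks(seriesBySentiment) {
  const names = Object.keys(seriesBySentiment);
  const links = [];
  for (let i = 0; i < names.length; i += 1) {
    for (let j = i + 1; j < names.length; j += 1) {
      links.push({
        source: names[i],
        target: names[j],
        correlation: pearson(seriesBySentiment[names[i]], seriesBySentiment[names[j]]),
      });
    }
  }
  return links;
}

// Made-up series, only to exercise the functions.
const series = {
  "Open-mindedness": [1, 2, 3, 4],
  Fascination: [2, 4, 6, 8],
  Stimulation: [4, 3, 2, 1],
};

const links = buildLinks(series);
```

The `source` and `target` field names follow the convention D3's force links expect, so an array like `links` could later be handed to a force simulation, with the correlation used to set link distance or thickness.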

Good Visualization Vs Bad Visualization

As part 2 of the Data Visualization project, we were tasked with analyzing good and bad forms of visualization. The goal is to visualize the data in the most meaningful way, so that it can help journalists make decisions when integrating strategies and tools to produce and distribute immersive media content. The data can serve as primary guidance here.

Because immersive, non-linear storytelling techniques have many facets, visualization will help journalists make judicious choices when deciding how to implement them in a newsroom. Visualization can also provide insight and guidance in selecting a suitable stack of technologies.

The range band for the current dataset is recorded with a timestamp for each session. Over the course of a session, participants were shifted between three stations, and the study was scheduled over three 2.5-hour sessions.

It makes sense to choose a visualization type that can represent the range of lows and highs of attention, relaxation, and heart rate per minute for each sentiment.
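Finding those lows and highs is a small grouping step that can be sketched as follows. The readings and field names here are made up for illustration and do not reflect the actual Multimer data schema; `extentBySentiment` is a hypothetical helper name.

```javascript
// Hypothetical readings; values and field names are invented for the example.
const readings = [
  { sentiment: "Fascination", attention: 57, relaxation: 40, heartRate: 72 },
  { sentiment: "Fascination", attention: 62, relaxation: 35, heartRate: 80 },
  { sentiment: "Stimulation", attention: 50, relaxation: 30, heartRate: 90 },
  { sentiment: "Stimulation", attention: 45, relaxation: 38, heartRate: 85 },
];

// For one metric, collect the minimum and maximum value per sentiment.
function extentBySentiment(rows, metric) {
  const result = {};
  for (const row of rows) {
    const value = row[metric];
    const current = result[row.sentiment];
    if (!current) {
      result[row.sentiment] = { min: value, max: value };
    } else {
      current.min = Math.min(current.min, value);
      current.max = Math.max(current.max, value);
    }
  }
  return result;
}

const attentionRange = extentBySentiment(readings, "attention");
const heartRateRange = extentBySentiment(readings, "heartRate");
```

The resulting low and high per sentiment are exactly what a chart axis or a range band needs as its domain.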

 

 

Line Graph

A line graph is best suited to revealing trends or progress over time, and in this case to showing acceleration (or deceleration) of the sentiments recorded. We have a continuous, timestamped data set, and with a line graph we can display how the data changes over time, with one line each for attention, relaxation, and heart rate.

A line graph is a good candidate: it is simple and intuitive for comparing one or many value sets, and it easily shows the low and high values of sentiments expressed while watching the story videos. “This is especially true when there are multiple emotions to compare,” says Ritchie S. King, author of Visual Storytelling with D3: An Introduction to Data Visualization. Further, he adds, “it is a better option because it provides more help to our eyes when we try to figure out how a value is developing over time”.

Scatterplot

A scatterplot rendering helps you understand the distribution of your data: the outliers, the central tendency, and the range of emotions in your values. When visualizing a time series, scatterplots display relationships in how data changes over a period of time.

Both line graphs and scatterplots are good candidates for visualizing how a data set performed during a specific time period. To outline the trends of power/intensity between different VR devices, a scatterplot will depict the distance and tension between them and help explain the relationship between value sets. Scatterplots and line graphs are also suited to showing how one variable relates to one or many other variables.

Stacked Column Chart

stacked-column-chart-age-of-new-customer-by-quarter

Stacked charts handle part-to-whole relationships, where you compare data to itself, often as percentages, rather than looking at a total. That applies in this case, where attention and relaxation are expressed as percentages. Heart rate can be one column set comprising the four sentiments captured.

Screen Shot 2017-11-09 at 4.23.54 PM

In the data above, the values of 44% Open-mindedness, 57% Fascination, 50% Stimulation, and 44% Intensity show the differences between sentiments for Attention.

The numbers we are working with are relative only to our total. This is a part-to-whole relationship, and for this we use a stacked bar graph. With proper spacing, each quarter in the sample chart is clearly visible, and the color coding shows the overall composition.
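To draw one stacked column, each value needs a start and end offset within the column plus its share of the total. The sketch below computes these for the four Attention figures quoted above. Stacking those four percentages into a single column is an assumption made for illustration, and `stackSegments` is a hypothetical helper name.

```javascript
// The four Attention values quoted in the text, one per sentiment.
const attention = [
  { sentiment: "Open-mindedness", value: 44 },
  { sentiment: "Fascination", value: 57 },
  { sentiment: "Stimulation", value: 50 },
  { sentiment: "Power / Intensity", value: 44 },
];

// Turn a list of values into stacked segments: each segment starts where
// the previous one ended, and `share` is its fraction of the column total.
function stackSegments(rows) {
  const total = rows.reduce((sum, row) => sum + row.value, 0);
  let offset = 0;
  return rows.map((row) => {
    const segment = {
      ...row,
      start: offset,
      end: offset + row.value,
      share: row.value / total,
    };
    offset += row.value;
    return segment;
  });
}

const segments = stackSegments(attention);
```

Each segment's `start` and `end` can be passed through a vertical scale to get the rectangle position, while `share` gives the part-to-whole percentage for labels or tooltips.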

New York Hall of Science Visit

 

This week I visited the New York Hall of Science with Alex, Brian, and Will. The Connected World and Mathematica exhibits that we visited were interesting. Connected World is an immersive, animated environment that responds to your actions (gestures, movements, and decisions), which affect how well the world is kept in balance.

The Connected World ecosystem has six environments juxtaposed in one space: jungle, desert, wetlands, mountain valley, reservoir, and plains.

Each has its own trees, plants, and animals, but they share a common supply of water from a waterfall that is projected several feet high and flows out across an interactive floor connecting the environments. Visitors can explore and interact with these features.

The environment reads your actions (gestures, movements, and decisions) and interprets them in terms of both short-term and long-term effects. This helps one understand the interconnectedness of different environments and see how the individual and collective actions of humans can have a widespread impact on nature.

The second exhibit, Mathematica, was more scientific and informative in nature. It highlighted the significance of mathematical concepts and their application in theory. One great exhibit was an interactive experiment demonstrating what Kepler helped explain: the farther an object is from the source of gravity, the slower its orbit. The marbles moved slowly at first and gradually increased speed as they got closer to the center.
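The relationship the marbles demonstrate can be put into numbers with the standard formulas for a circular orbit: speed v = sqrt(mu / r) and period T = 2π · sqrt(r³ / mu), where mu is the gravitational parameter of the central body. The sketch below is only an idealized illustration, not a model of the exhibit itself; setting mu to 1 is an arbitrary unit choice, and the function names are hypothetical.

```javascript
// Speed of a body on a circular orbit of the given radius.
// `mu` is the gravitational parameter (G times the central mass);
// the default of 1 is an arbitrary choice of units for this example.
function orbitalSpeed(radius, mu = 1) {
  return Math.sqrt(mu / radius);
}

// Time for one full circular orbit (Kepler's third law: T^2 is proportional to r^3).
function orbitalPeriod(radius, mu = 1) {
  return 2 * Math.PI * Math.sqrt(radius ** 3 / mu);
}

// Compare a few orbits, from close to the center to far away.
const radii = [1, 4, 9];
const orbits = radii.map((radius) => ({
  radius,
  speed: orbitalSpeed(radius),
  period: orbitalPeriod(radius),
}));
```

Reading the `orbits` array from the last entry to the first mirrors the marbles: as the radius shrinks, the speed rises and the time per lap drops.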

Overall the visit was quite useful, and I learned about different ways data is generated and visualized. I also learned about the important relationship between mathematics and visualization, including some mathematical concepts that are used in visualization, such as the golden ratio and Fibonacci numbers.
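The link between the two concepts is easy to check numerically: the ratio of consecutive Fibonacci numbers approaches the golden ratio, (1 + √5) / 2. A minimal sketch, with the function name `fibonacci` chosen for illustration:

```javascript
// Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ...
function fibonacci(count) {
  const sequence = [1, 1];
  while (sequence.length < count) {
    const n = sequence.length;
    sequence.push(sequence[n - 1] + sequence[n - 2]);
  }
  return sequence.slice(0, count);
}

// The golden ratio in closed form.
const goldenRatio = (1 + Math.sqrt(5)) / 2;

// Ratio of the 20th to the 19th Fibonacci number.
const fib = fibonacci(20);
const ratio = fib[19] / fib[18];

// How far that ratio still is from the golden ratio.
const difference = Math.abs(ratio - goldenRatio);
```

Layouts and spirals based on the golden ratio use this same proportion to size or place elements.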

Phase 1 Development Process

Discovery Phase

The process began with the investigation of the city crime report for 2009 provided by the Office of the Mayor (OTM), and by reading various types of reports made available by other data sources.

This was followed by a comparative study of the reports gathered from the different sources, to get better insight into the types of crime reported in urban cities and to identify the highlights that would be useful for the visualization. It was also imperative to discern the key attributes in the report, to distinguish which information is most important and should be prioritized over the rest. Other key attributes identified in the report, such as population-based figures, also helped in understanding the highs and lows of crime at a granular level.

Data Inquiry

After reviewing the data, the next step was to finalize the project data and choose a data format that would integrate well with the technology stack selected for the project, especially JavaScript and D3. JSON was chosen because it works well with REST APIs and is convenient to traverse.

Data Preprocessing

Based on the analysis done as part of the data inquiry, redundant data such as id attributes and numbers were removed, and some additional data points relevant to the visualization, such as city population, were added to the final project data. The data was also sorted into the order required for the final visualization.
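The three preprocessing steps (drop redundant fields, add population, sort) can be sketched as a single small function. The record shape, city names, numbers, sort order, and the name `preprocess` are all hypothetical and stand in for the real crime report data.

```javascript
// Hypothetical raw records and population lookup; all values are made up.
const rawRecords = [
  { id: 17, city: "City B", crimes: 1200 },
  { id: 4, city: "City A", crimes: 3400 },
  { id: 9, city: "City C", crimes: 800 },
];

const populationByCity = {
  "City A": 800000,
  "City B": 400000,
  "City C": 100000,
};

// Drop the redundant `id`, attach the city population, and sort the
// result by crime count, highest first (the sort key is an assumption).
function preprocess(records, populations) {
  return records
    .map(({ id, ...rest }) => ({ ...rest, population: populations[rest.city] }))
    .sort((a, b) => b.crimes - a.crimes);
}

const cleaned = preprocess(rawRecords, populationByCity);
```

Because `map` builds new objects before sorting, the original `rawRecords` array is left untouched, which makes it easy to rerun the step with a different sort order.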

Training

Given the technology stack selected, it was important to have a fair knowledge of SVG and the D3 library before starting development. The D3 documentation was very useful for understanding the key concepts and APIs needed for the implementation. I went through an online training course on D3 fundamentals and Scott Murray’s book on data visualization for the web. The book was very helpful for getting the SVG fundamentals right.

Prototyping

Before the development kickoff, rapid prototyping with static data produced the first prototype. It gave a clear understanding of basic visualization using D3, SVG, and CSS3, the role each of these technologies plays, and how they integrate as a whole.

Development Workflow

The prototyping phase also helped set up the development workflow and fix early problems such as CORS (cross-origin resource sharing) issues. The basic prototype served as the starting point for the final development.

A line graph with connected points was chosen as the visualization form. D3’s path and linear scale features were used extensively for the line graph. Constructing the scales and the row and column axes was the complex part of the development, given that D3 has something of a learning curve here. It also required some math to compute the ranking values and work out the exact coordinates representing each city in the graph.
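The pieces described above (ranking, linear scales, and a path through the points) can be illustrated with a library-free sketch. It mimics what a linear scale and a line generator do rather than calling D3 itself, and it is not the project's code. The city names, rates, chart dimensions, and function names are hypothetical.

```javascript
// Linear scale: maps a value in the domain [d0, d1] to the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Rank rows by a numeric key, rank 1 being the highest value.
// Ties are not handled in this sketch.
function rankByValue(rows, key) {
  const sorted = [...rows].sort((a, b) => b[key] - a[key]);
  return sorted.map((row, index) => ({ ...row, rank: index + 1 }));
}

// Build an SVG path string ("M x,y L x,y ...") through the given points.
function linePath(points) {
  return points
    .map(([x, y], index) => `${index === 0 ? "M" : "L"}${x},${y}`)
    .join(" ");
}

// Made-up cities and rates, only to exercise the functions.
const cities = [
  { city: "City A", rate: 40 },
  { city: "City B", rate: 10 },
  { city: "City C", rate: 20 },
];

const ranked = rankByValue(cities, "rate");

// Horizontal position comes from the rank, vertical position from the rate.
const xScale = linearScale([1, ranked.length], [0, 400]);
// The range is inverted because SVG y coordinates grow downward.
const yScale = linearScale([0, 40], [200, 0]);

const pathData = linePath(ranked.map((row) => [xScale(row.rank), yScale(row.rate)]));
```

The string in `pathData` is what would go into the `d` attribute of an SVG path element to draw the connected line.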

Presentation and Styling

Besides utilizing style properties in D3 and SVG, CSS3 was used extensively to style the major components of the visualization including the path and shape colors.

Interaction and Behavior

The main interaction component added was a tooltip overlay that provides a tidbit of information about the corresponding city. The stack of buttons at the bottom is used to interact with the visualization, for example to toggle between rates and line nodes.

Performance Optimization

No major performance issues were detected. Even so, the code was optimized to reduce the file size of the script, which reduced the overall payload and sped up rendering of the visualization on smaller devices.

Summary

Text content was added below the visualization to explain its key factors and to give other information about the dataset used.