Making it possible to debug and introspect data transformations #4134

tmcw · 2018-08-17T17:24:23Z

It seems like the idea of transforms built into vega-lite is that you might be able to use a wider range of input data without writing custom JavaScript, by using the little transform functions. Those are cool, but given that what you then have is something like data → (transformed data) → visualization result, afaict there's no way to take a peek at the in-between transformed data.

Looking at the transformed data, as it stands, seems quite difficult, for a number of reasons. To get at it, you need to:

1. Write a vega-lite schema
2. Compile it to vega
3. Run it with vega
4. Call .data(name) on the produced chart
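
The steps above can be sketched in JavaScript roughly as follows. This is a minimal sketch, not an official recipe: it assumes the `vega` and `vega-lite` packages have already been loaded and are passed in as `vega` and `vegaLite`, the helper name `peekDataset` is hypothetical, and the dataset name (for example `data_0`) still has to be read off the compiled spec.

```javascript
// Hypothetical helper: compile a Vega-Lite spec, run it headlessly, and
// return the rows of one named Vega dataset.
// `vegaLite` and `vega` are the loaded library modules, passed in as
// arguments so the sketch does not hard-code how they are imported.
async function peekDataset(vegaLite, vega, vlSpec, name) {
  // Step 2: compile Vega-Lite to Vega; the Vega spec is under `.spec`.
  const vgSpec = vegaLite.compile(vlSpec).spec;

  // Step 3: parse into a runtime dataflow and run it without a renderer.
  const view = new vega.View(vega.parse(vgSpec), { renderer: 'none' });
  await view.runAsync();

  // Step 4: read the dataset by name (an assumption such as 'data_0').
  return view.data(name);
}
```

With both packages loaded, a call would look something like `peekDataset(vegaLite, vega, spec, 'data_0').then(console.table)`.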

This is tricky because:

- The vega-lite documentation, unlike the vega documentation, doesn't have a Debugging section. You should use the vega debugging section with vega-lite, but it isn't directly linked. The relationship of the spec and implementations is, well, rather confusing.
- Vega-lite specs don't necessarily name their datasets, but vega requires you to provide a name when you call .data(name). Calling .data() throws an error. The only way to get this to work, afaict, is to look at the intermediate compiled vega spec, which, if you've been living on the vega-lite abstraction level, might not be very familiar. The dataset names are also not guessable; one cannot, for example, call .data(0).
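
Since the names have to come from the compiled Vega spec, one option is to read them from its top-level `data` array instead of scanning the JSON by eye. The following is a minimal sketch; `listDatasets` is a hypothetical helper, and `compiledExample` is a hand-written stand-in for a compiled spec, not real compiler output.

```javascript
// Hypothetical helper: list the top-level datasets declared in a compiled
// Vega spec, together with the dataset each one derives from (if any).
function listDatasets(vgSpec) {
  return (vgSpec.data || []).map((d) => ({
    name: d.name,
    source: d.source || null
  }));
}

// Hand-written stand-in for a compiled spec (an assumption for illustration).
const compiledExample = {
  data: [
    { name: 'source_0', values: [{ a: 1 }] },
    { name: 'data_0', source: 'source_0' }
  ]
};

const datasetNames = listDatasets(compiledExample).map((d) => d.name);
// datasetNames is ['source_0', 'data_0']
```

Each returned name is a candidate argument for .data(name) on the running view.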

What might make this better:

- Documentation that makes the relationship between these tools much clearer, and treats visualization as a debugging exercise. My experience and what I've heard from others is that, well, debugging is currently a really hard part of the Vega ecosystem, and it doesn't have to be that way.
- Maybe a debug output mode for vega-lite?

domoritz · 2018-08-17T20:03:06Z

Thank you Tom for writing up these challenges that users face when debugging Vega-Lite!

The related issue on naming datasets is #3789.

domoritz · 2018-10-03T01:26:53Z

The Vega-Editor now has a data viewer, which addresses some of the points you mention. The next step should be more thorough documentation of how Vega-Lite is translated to Vega and a debugging guide.

onetom · 2020-04-29T05:55:18Z

I'm struggling with this too.

After seeing that I can name my data sources and after looking at generated vega specs, I was guessing that I could just name transforms in vega-lite too, like:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"key": "alpha", "foo": [1, 2], "bar": ["A", "B"]},
      {"key": "beta", "foo": [3, 4, 5], "bar": ["C", "D"]}
    ]
  },
  "transform": [{"flatten": ["foo", "bar"], "name": "with_bar_flattened"}],
  "mark": "circle",
  "encoding": {
    "x": {"field": "foo", "type": "quantitative"},
    "y": {"field": "bar", "type": "nominal"},
    "color": {"field": "key", "type": "nominal"}
  }
}

So in the vega editor, I would see with_bar_flattened instead of the current data_0.
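
For reference, the rows one would expect to find in that flattened dataset can be approximated in plain JavaScript. This is only a sketch of the flatten behaviour as described in the Vega documentation (parallel arrays are zipped, and shorter arrays are padded with null); `flattenRows` is a hypothetical helper, not part of Vega or Vega-Lite.

```javascript
// Plain-JavaScript approximation of a flatten transform over parallel
// array fields. Shorter arrays are padded with null. Non-array values are
// treated as empty arrays here, which is a simplification.
function flattenRows(rows, fields) {
  const out = [];
  for (const row of rows) {
    const arrays = fields.map((f) => (Array.isArray(row[f]) ? row[f] : []));
    const length = Math.max(0, ...arrays.map((a) => a.length));
    for (let i = 0; i < length; i++) {
      const flat = { ...row }; // keep untouched fields such as `key`
      fields.forEach((f, j) => {
        flat[f] = i < arrays[j].length ? arrays[j][i] : null;
      });
      out.push(flat);
    }
  }
  return out;
}

const inputValues = [
  { key: 'alpha', foo: [1, 2], bar: ['A', 'B'] },
  { key: 'beta', foo: [3, 4, 5], bar: ['C', 'D'] }
];
const flattened = flattenRows(inputValues, ['foo', 'bar']);
// flattened has 5 rows; the last one is { key: 'beta', foo: 5, bar: null }
```

Comparing this against what the data viewer shows is a quick sanity check on the transform.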

I'm learning vega-lite through https://github.com/metasoarous/oz though, so I'm hoping I can figure out some way to conveniently debug from a REPL. I haven't looked into https://github.com/jsa-aerial/hanami yet; that looks promising too.

domoritz · 2020-04-29T06:04:11Z

Thank you for the feedback @onetom. Unfortunately, there is no one-to-one correspondence between Vega-Lite transforms and Vega datasets: multiple transforms can end up in a single Vega dataset. So naming datasets this way is not possible.
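
To see how transforms end up grouped, one can list the transform types attached to each dataset in a compiled Vega spec. This is a minimal sketch; `transformsByDataset` is a hypothetical helper, and `sampleVgSpec` is hand-written for illustration rather than actual compiler output.

```javascript
// Hypothetical helper: map each top-level Vega dataset name to the list
// of transform types applied within that dataset.
function transformsByDataset(vgSpec) {
  const out = {};
  for (const d of vgSpec.data || []) {
    out[d.name] = (d.transform || []).map((t) => t.type);
  }
  return out;
}

// Hand-written example (an assumption): two transforms share one dataset.
const sampleVgSpec = {
  data: [
    { name: 'source_0', values: [] },
    {
      name: 'data_0',
      source: 'source_0',
      transform: [
        { type: 'flatten', fields: ['foo', 'bar'] },
        { type: 'filter', expr: 'datum.foo > 1' }
      ]
    }
  ]
};

const grouped = transformsByDataset(sampleVgSpec);
// grouped is { source_0: [], data_0: ['flatten', 'filter'] }
```

In a spec shaped like this, a single name such as `data_0` covers both transforms, so there is no separate dataset to attach a per-transform name to.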

domoritz added the Enhancement 🎉 label Oct 3, 2018

kanitw added the Area - Data & Transform label Dec 11, 2019

tmcw closed this as completed Dec 26, 2020

saulshanabrook mentioned this issue Aug 30, 2021: Add Runtime Dataflow Viewer vega/editor#1023 (merged)

venkateshpotluri mentioned this issue Sep 16, 2021: read transformed data from Vega view object after running the data through the data flow make4all/psst#4 (closed)
