A scatterplot reveals the relationship between two raster outputs over the same area of interest, pixel by pixel. Each pixel that overlaps both rasters becomes one point on a 2-D plot, one axis per raster. The platform renders this as a density heatmap (lighter cells = more pixels in that combination) using a viridis or magma colour ramp. You can switch to a sample scatter when outliers matter more than density. Unlike downloading two layers and plotting them in QGIS, the scatterplot runs directly on Y-Cloud against the cloud-stored COGs — no download, no local processing, results in 1–3 seconds for typical AOIs.
When to Use a Scatterplot
Use a scatterplot when you want to verify or explore the joint behaviour of two continuous numerical outputs. Common scenarios:
- Spectral feature comparison. Plot two absorption-feature depths against each other to separate mineral populations. Classic examples:
  - AlOH 2165 vs AlOH 2200: separates kaolinite from white mica.
  - MgOH 2340 vs AlOH 2200: chlorite vs phyllic alteration.
  - FeOH 1550 vs MgOH 2340: Fe-rich vs Mg-rich alteration.
- MTMF quality check. Plot the match-fraction (MF) against the infeasibility (INF) of the same MTMF run. Real matches cluster at high MF / low INF; spurious matches cluster at high INF.
- PCA component pairs. Plot PC1 vs PC2 (or any pair) to see population separation in feature space before deciding which PC to use as your insight.
- Index validation. Check the correlation between two indices computed over the same AOI — useful when a new index gives unexpected values and you want to know if it correlates with something known.
Scatterplots only work for continuous outputs. Classification insights (clf) are not eligible. For those, use a contingency matrix (planned for a future phase, see below).
Opening a Scatterplot
To compare two different insights over the same AOI (for example, AlOH 2165 from one insight vs AlOH 2200 from another):
- Open Y-Cloud (the navbar entry that lists your insights).
- Enable multi-select on insight rows.
- Select exactly two insights that:
  - belong to the same AOI,
  - share the same family (continuous, PCA or raw; classified outputs are excluded),
  - share the same grid (same CRS, transform and dimensions, which is the natural case when both come from the same insight pipeline run).
- The Scatter button at the top of the list becomes active. Click it.
If the two insights do not share a grid, the backend returns a clear grids mismatch error before computing anything.
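The three grid conditions can also be checked on your side before opening the dialog. The following is a minimal sketch, assuming each insight's grid is known as a CRS string, six affine transform coefficients and pixel dimensions. The `Grid` class and `grid_mismatch` function are hypothetical names for illustration, not part of the platform, and the backend's own comparison may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Grid:
    """Hypothetical description of a raster grid (not a platform type)."""
    crs: str                      # e.g. "EPSG:32719"
    transform: Tuple[float, ...]  # six affine coefficients
    width: int
    height: int


def grid_mismatch(a: Grid, b: Grid, tol: float = 1e-9) -> Optional[str]:
    """Return a reason string if the grids differ, or None if they match."""
    if a.crs != b.crs:
        return f"CRS differs: {a.crs} vs {b.crs}"
    if (a.width, a.height) != (b.width, b.height):
        return f"dimensions differ: {a.width}x{a.height} vs {b.width}x{b.height}"
    if len(a.transform) != len(b.transform) or any(
        abs(p - q) > tol for p, q in zip(a.transform, b.transform)
    ):
        return "transform differs"
    return None


# Illustrative values only: two 30 m grids and one 60 m grid over the same origin.
g1 = Grid("EPSG:32719", (30.0, 0.0, 500000.0, 0.0, -30.0, 7500000.0), 2048, 2048)
g2 = Grid("EPSG:32719", (30.0, 0.0, 500000.0, 0.0, -30.0, 7500000.0), 2048, 2048)
g3 = Grid("EPSG:32719", (60.0, 0.0, 500000.0, 0.0, -60.0, 7500000.0), 1024, 1024)

print(grid_mismatch(g1, g2))  # matching grids
print(grid_mismatch(g1, g3))  # different resolution, so different dimensions
```

A small tolerance on the transform is used here because coefficients that went through different write paths may differ in the last floating-point digits; whether the backend tolerates such differences is not stated.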
Step-by-Step Guide
Once the dialog opens, fill in the parameters described below.
1. Axes (auto-filled)
X and Y axes are pre-populated from the two insights you selected. Each axis carries:
- A label (band name).
- The source COG URI (read-only).
- A band number.
You can swap X ↔ Y from the dialog if the orientation matters for the comparison.
2. Bins
Default 64 × 64. Available options are 32 × 32, 64 × 64, and 128 × 128.
Rule of thumb:
- 32 × 32 — fast, coarse density. Use for a first look.
- 64 × 64 — default; sweet spot for most AOIs.
- 128 × 128 — when you need to see fine structure (for example, a thin mineral trend through the bulk cloud).
3. Percentile Clipping
Default [2, 98] — clip the bottom 2% and top 2% of values per axis before computing the bin edges. This stops a single extreme outlier from compressing the rest of the data into one bin.
You can widen to [0, 100] (no clipping) to inspect outliers, or tighten to [5, 95] to focus on the bulk.
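To see why clipping matters, the sketch below computes equal-width bin edges for one axis with and without clipping. It is an illustration only: the `percentile` and `bin_edges` helpers are hypothetical, the linear-interpolation percentile is an assumption (the platform's exact method is not documented here), and the depth values are invented.

```python
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], p: float) -> float:
    """Percentile with linear interpolation between the closest ranks."""
    data = sorted(values)
    if not data:
        raise ValueError("no values")
    rank = (len(data) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (rank - lo)


def bin_edges(values: Sequence[float], bins: int = 64,
              clip: Tuple[float, float] = (2.0, 98.0)) -> List[float]:
    """Equal-width bin edges spanning the percentile-clipped value range."""
    lo, hi = percentile(values, clip[0]), percentile(values, clip[1])
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]


# Invented data: 1000 feature depths between 0 and about 0.2, plus one extreme outlier.
depths = [i * 0.0002 for i in range(1000)] + [50.0]

clipped = bin_edges(depths, bins=64, clip=(2.0, 98.0))
unclipped = bin_edges(depths, bins=64, clip=(0.0, 100.0))

print("clipped range:  ", clipped[0], "to", clipped[-1])
print("unclipped range:", unclipped[0], "to", unclipped[-1])
print("width of first unclipped bin:", unclipped[1] - unclipped[0])
```

With no clipping, the first bin alone is wider than the whole range of ordinary values, so every non-outlier pixel lands in it. With [2, 98] the 64 bins are spread over the bulk of the data instead.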
4. Return Mode
Two options, picked from a toggle in the dialog:
| Mode | What it returns | When to use |
|---|---|---|
| Density (default) | A 2-D grid of counts per cell (bins_x × bins_y) rendered as a heatmap. Small payload, fast render. | First-pass exploration. Pattern detection. Multi-population separation. |
| Samples | A uniform random sample of up to 50 000 pixel pairs as a true scatter. | Outlier inspection. Verifying that a faint cluster is not an artefact of binning. |
You can switch modes inside the dialog without re-running the selection — but the backend does need to recompute. The dialog handles the re-request for you.
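The difference between the two modes can be sketched in a few lines. This is not the backend's code: `density_grid` and `sample_pairs` are hypothetical helpers, the pixel pairs are random numbers, and dropping values outside the clipped range (rather than clamping them to the edge bins) is an assumption.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]


def density_grid(pairs: Sequence[Pair], x_range: Pair, y_range: Pair,
                 bins: int = 64) -> List[List[int]]:
    """Count pixel pairs per cell; grid[row][col] with row = y bin, col = x bin."""
    (x0, x1), (y0, y1) = x_range, y_range
    if x1 <= x0 or y1 <= y0:
        raise ValueError("empty axis range")
    grid = [[0] * bins for _ in range(bins)]
    for x, y in pairs:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # outside the clipped range: dropped in this sketch
        col = min(int((x - x0) / (x1 - x0) * bins), bins - 1)
        row = min(int((y - y0) / (y1 - y0) * bins), bins - 1)
        grid[row][col] += 1
    return grid


def sample_pairs(pairs: Sequence[Pair], cap: int = 50_000,
                 seed: int = 0) -> List[Pair]:
    """Uniform random sample without replacement, at most `cap` pairs."""
    if len(pairs) <= cap:
        return list(pairs)
    return random.Random(seed).sample(list(pairs), cap)


# Invented data: 60 000 pixel pairs with both values in [0, 1).
rng = random.Random(42)
pixels = [(rng.random(), rng.random()) for _ in range(60_000)]

grid = density_grid(pixels, (0.0, 1.0), (0.0, 1.0), bins=32)
dots = sample_pairs(pixels)

print("density payload:", len(grid) * len(grid[0]), "counts")
print("samples payload:", len(dots), "pixel pairs")
```

The density payload depends only on the bin count, which is why it stays small regardless of AOI size; the samples payload grows with the number of pixels until it reaches the cap.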
5. Submit
Click Run. The dialog displays a spinner while the backend reads the COGs, clips to AOI, samples or reads at full resolution, and computes the density. Typical wall time is 1–3 seconds for AOIs up to ~10 M pixels; larger AOIs trigger automatic down-sampling (see “Approximate Results” below).
Reading the Result
Density Heatmap (default)
A 2-D matrix of cells, coloured by pixel count. The brightest cells contain the most pixels; black or dark cells are empty.
Axes are labelled with the band name and clipped to your chosen percentile range. The axis labels show the actual numeric range used (for example, 0.01 to 0.20 for an AlOH 2200 depth).
What patterns mean:
- A single elongated cloud along a diagonal — the two outputs are correlated. Strong positive diagonal = they track each other; strong negative = they are inversely related.
- Two or more distinct bright cells separated by dark space — two or more mineral populations. Each cluster is a different surface type. This is the canonical case for the AlOH 2165 vs 2200 comparison: kaolinite and white mica form two distinct clusters.
- A horizontal or vertical streak — one variable hardly varies while the other does. Often indicates one of the rasters is dominated by a single value (saturation, nodata, low contrast).
- A diffuse cloud with no structure — the two outputs are uncorrelated. Not necessarily a problem; means the variables are measuring independent things.
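If you want a number to go with the first and last patterns, a Pearson correlation coefficient on downloaded pixel pairs is one common choice. The sketch below is a plain-Python illustration on invented values; `pearson_r` is a hypothetical helper, and the platform itself does not report this statistic as far as this page describes.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    if min(xs) == max(xs) or min(ys) == max(ys):
        # Corresponds to a purely horizontal or vertical streak on the chart.
        raise ValueError("one variable is constant; correlation is undefined")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented depths: one output rising with the other, one falling.
xs = [0.01 * i for i in range(1, 11)]
rising = [2 * x + 0.05 for x in xs]
falling = [0.5 - x for x in xs]

print("rising: ", round(pearson_r(xs, rising), 6))
print("falling:", round(pearson_r(xs, falling), 6))
```

A value near +1 or -1 matches an elongated diagonal cloud, and a value near 0 matches a diffuse cloud. A single coefficient cannot reveal multiple clusters, so it complements the chart rather than replacing it.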
Samples Scatter
When you toggle to samples, each dot is one real pixel. Up to 50 000 dots — beyond that the backend caps and warns you in the response.
Dots reveal outliers that density bins hide. Example: a faint streak of 50 anomalous pixels is absorbed into a few dim bins in a density plot, but appears as a clear arc in samples mode.
Metadata Under the Chart
- valid_pixels: total pixels that contributed to the chart (after AOI clip and nodata exclusion).
- timing_ms: server-side compute time.
- approximate: false if the chart used every pixel; true if the backend down-sampled.
- approx_reason: when approximate is true, the strategy that was used (overview, downsample, downsample_aggressive, or sparse_sampling).
Approximate Results
For AOIs above ~10 M pixels the backend automatically reduces precision to stay under the 15 s timeout:
| approx_reason | What happened |
|---|---|
| overview | The backend used a built-in COG overview pyramid (≥ 16× reduction). Statistically equivalent to the full read; fast. |
| downsample | The backend regenerated a lower-resolution view on the fly (nearest-neighbour). Good for visual patterns; may slightly bias rare values. |
| downsample_aggressive | Larger reduction for very big AOIs (> 10 M pixels). |
| sparse_sampling | Random pixel sampling, used when other strategies still exceed the deadline. |
For most AOIs (< 5 M pixels) you will never see this — the density is computed at full resolution.
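If you log or script around scatter results, the metadata fields can be turned into a one-line summary. The sketch below assumes the metadata is available as a dictionary keyed by the field names listed under the chart; that shape, the `describe_result` helper and the example values are assumptions for illustration, not a documented response format.

```python
from typing import Any, Dict

# Short descriptions paraphrased from the approx_reason table above.
APPROX_REASONS = {
    "overview": "read from a COG overview level",
    "downsample": "nearest-neighbour down-sampling on the fly",
    "downsample_aggressive": "larger reduction for a very big AOI",
    "sparse_sampling": "random pixel sampling",
}


def describe_result(meta: Dict[str, Any]) -> str:
    """One-line summary of the metadata shown under the chart."""
    base = f"{meta['valid_pixels']:,} valid pixels in {meta['timing_ms']} ms"
    if not meta.get("approximate", False):
        return base + " (full resolution)"
    reason = meta.get("approx_reason")
    detail = APPROX_REASONS.get(reason, f"unrecognised reason {reason!r}")
    return base + f" (approximate: {detail})"


# Invented example values.
exact = {"valid_pixels": 1_200_000, "timing_ms": 850, "approximate": False}
approx = {"valid_pixels": 24_000_000, "timing_ms": 4200,
          "approximate": True, "approx_reason": "overview"}

print(describe_result(exact))
print(describe_result(approx))
```

Checking the approximate flag before drawing conclusions about rare values is the main point: the table above notes that on-the-fly down-sampling may slightly bias them.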
Common Errors and What They Mean
| Message | Cause | Fix |
|---|---|---|
| grids mismatch | The two COGs differ in CRS / transform / dimensions. | Use two insights from the same pipeline run, or rebuild one of them with matching grid settings. |
| axis family not supported | One of the inputs is a classification (clf) output. | Scatter is continuous-only. For classification × classification, use a contingency matrix (future feature). |
| no data on chart | After AOI clip and nodata mask the intersection had < 100 valid pixels. | Check that both insights actually cover the same AOI; widen the AOI; check for nodata sentinel values. |
| bins_x × bins_y exceeds 128 × 128 | Asked for too fine a grid. | Reduce bin counts; 64 × 64 is sufficient for most cases. |
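Several of these errors can be anticipated before a request is sent. The sketch below collects the documented constraints (continuous-only families, matching families, bin options, a sensible percentile range) into one check. The `check_request` helper and the family label strings are hypothetical; the real request format and the backend's own validation are not described on this page.

```python
from typing import List, Tuple

ALLOWED_BINS = (32, 64, 128)                         # options offered in the dialog
CONTINUOUS_FAMILIES = {"continuous", "pca", "raw"}   # label strings are assumed


def check_request(family_x: str, family_y: str, bins: int,
                  clip: Tuple[float, float]) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    for axis, family in (("X", family_x), ("Y", family_y)):
        if family not in CONTINUOUS_FAMILIES:
            problems.append(f"{axis} axis family {family!r} not supported")
    if family_x != family_y:
        problems.append("the two insights do not share the same family")
    if bins not in ALLOWED_BINS:
        problems.append(f"bins must be one of {ALLOWED_BINS}, got {bins}")
    lo, hi = clip
    if not (0 <= lo < hi <= 100):
        problems.append(f"percentile clip {clip} must satisfy 0 <= low < high <= 100")
    return problems


ok = check_request("continuous", "continuous", 64, (2, 98))
bad = check_request("clf", "continuous", 256, (98, 2))

print(ok)
for problem in bad:
    print("-", problem)
```

The grid and valid-pixel checks are left out because they need the raster data itself; they are reported by the backend as described in the table above.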
Tips and Best Practice
- Start coarse. 32 × 32 bins, default percentile clipping, density mode — gives you the pattern in under a second. Refine afterwards.
- Density first, samples second. Density tells you where the populations are; samples confirm whether thin features are real or binning artefacts.
- Two clusters at right angles typically mean two independent mineral phases coexist in the AOI — useful for prioritising drill targets.
- Do not compare across very different scales (for example, raw reflectance vs PCA component) — the units differ and the heatmap will not read meaningfully. Stay within compatible families.
- The chart is not a hypothesis test. It tells you what your AOI looks like in this 2-D feature space. Quantitative inference (significance, regression) still belongs in the download → R / Python flow.
Planned Future Features
- Lasso selection: draw a region on the chart and have those exact pixels highlighted on the map. The reverse direction (map polygon → chart highlight) is also planned.
- Contingency matrix: the categorical equivalent of scatter for classified × classified axes.
- Density caching: repeat requests of the same scatter pair will be served from cache.