# Microplastics ML/CV for Aquatic Monitoring — Full Report

**Generated:** 2026-05-21  

**Format:** Markdown (open in any editor; print to PDF from VS Code, Pandoc, or browser)


This document is the **full evidence package** for the Labs 7Lineas article
*"What ML Papers Actually Show About Microplastic Detection in Water (2019–2025)"*.

It combines:

1. A **plain-language guide** (topic, purpose, pre-conclusions, how to share)
2. The **systematic map manuscript** (`slr.md`)
3. **Regional and Colombia** synthesis documents
4. **Gap inventory**, **open questions**, and **PRISMA** counts
5. **All structured claims** and the **extraction table**
6. **Protocol** and **metrics** reference material

**Type of study:** Systematic **map** (not a pooled meta-analysis).  
**Search lock:** 2026-05-18 · **Source:** OpenAlex · **Corpus:** 228 unique works.

**How to cite (informal):** 7Lineas Labs (2026). *Microplastics ML/CV Aquatic Monitoring — Full Report.*
Generated from the notebook research factory `microplastics-ml-detection-slr`.

---

## Plain-language summary

### What is this research about?

Microplastics are plastic particles typically under 5 mm. They are widespread in rivers, coasts, and wastewater. Counting and identifying them with traditional lab methods is slow and expensive. Many papers claim that **machine learning** and **computer vision** can automate detection using cameras, satellites, Raman/FTIR spectroscopy, and similar tools.

This map asks a practical question: **what does the published literature actually show**, and **what can you trust** if you want real monitoring—especially where budgets and labs are limited (Global South, Colombia, Magdalena/Caribbean context)?

### Why we did it (purpose)

1. **Map the field** — Chart which methods exist, what metrics they report, and where deployment breaks down.
2. **Cut through hype** — Headline "98% accuracy" often means macro litter, satellite debris, or lab spectroscopy—not a cheap sensor in a turbid river.
3. **Support decisions** — For policy, utilities, or pilots: what is defensible to fund, what needs local replication, what **not** to buy from abstracts alone.
4. **Colombia / LATAM lens** — Check local field validation; in this corpus there is **no** obtained primary for Colombia aquatic ML/CV microplastics monitoring.

### Pre-conclusion (what the evidence supports)

**One sentence:** Transfer to resource-limited and Latin American monitoring requires **separating macro-litter CV from MP-specific ID**, reporting **precision and mAP together**, and using **tiered architectures** (surveillance + lab confirmation)—not headline accuracy alone.

**Six defensible claims:**

1. **Lab spectroscopy + ML** — High polymer-class accuracy on prepared fractions; high capex and prep (μFTIR, µ-Raman, Raman).
2. **Microscopy / flow imaging** — Useful on controlled matrices; limits must be stated.
3. **Field MP vision** — Emerging; one strong full-text example has usable precision but **poor mAP** (~34–36%).
4. **Macro litter CV** in Global South rivers/coasts is **real but not a substitute** for MP policy.
5. **Satellite / AUV debris** — Valid for **litter surveillance**, not sub-mm MPs.
6. **In-situ aquatic MP sensing** — Mostly review/prototype vs lab spectroscopy.

**Tiered monitoring (hypothesis, not one product):**

| Tier | Role |
| --- | --- |
| 0 | Prevention / source control |
| 1 | Macro litter surveillance (drone, satellite) — separate KPI |
| 2 | Field screening (UV/RGB) — report mAP + precision |
| 3 | Lab confirmation hub (µ-Raman / μFTIR on subsets) |
| 4 | Research pilots only |

**Colombia:** Do not cite this map as evidence of Colombia-validated ML/CV MP performance when procuring systems; it contains none. Treat transferable methods as **pilot hypotheses** with explicit transfer risk.

### What this document is not

- Not peer-reviewed journal output, and not submission-ready without a human edit pass
- Not pooled effect sizes across studies
- Not proof that no Colombian research exists; only that this OpenAlex map found no obtained field primary

### How to share this work

| Channel | Artifact |
| --- | --- |
| Labs article | https://labs-7lineas (article + this download) |
| Full map | `slr.md` in notebook repo |
| Interactive | Evidence explorer HTML in research `outputs/products/` |
| Preprint | Export this file or `slr.md` to Zenodo/OSF |
| Journal | Trim `slr.md`, add dual screening, register protocol |

---


# Part I — Systematic map manuscript


# Systematic map: ML and computer vision for microplastic detection in aquatic matrices (2019–2025)

**Status:** Phase 9 in progress — §6–7 regional/deployment sections drafted (9.3–9.4); remaining LATAM spine items pending.

## Abstract

Microplastic pollution in rivers, coasts, and wastewater drives demand for automated detection, yet reported machine-learning (ML) and computer-vision (CV) performance is fragmented across incompatible metrics and scales (e.g. W4391755619 precision vs mAP; W4291123479 macro-litter task vs MP intent). We systematically mapped **228** studies (2019–2025) on ML/CV for aquatic microplastic detection retrieved via OpenAlex, screened all title–abstract records, and forwarded **116** to full text; **31** open-access full texts were obtained and synthesised with **37** structured extractions and **61** evidence-linked claims.

Primary evidence clusters into **vibrational spectroscopy plus ML** (μFTIR, Raman, µ-Raman) with high laboratory classification accuracy on environmental matrices (e.g. μFTIR random-decision-forest accuracy 0.9766, κ 0.9690, W4200249418; Raman nanoplastic RF accuracy 98.8%, W4382931577) but heavy capital cost and limited in-situ deployment. **RGB object-detection** papers often target **macro litter or marine debris**, not polymer-level microplastics in water—e.g. riverine YOLO mAP 89% on India urban solid waste (W4291123479) versus Thailand UV Faster R-CNN with high precision (85.5–87.8%) but low mAP (33.9–35.7%) on field MP boxes (W4391755619). Satellite Sentinel-2 workflows report scenario-dependent accuracy up to 98% for floating debris at resolutions unsuitable for sub-millimetre MPs (W4380082849, W4205835860).

Reviews emphasise automation gaps, standardisation needs, and rising interest in **in-situ** aquatic monitoring (W4396828529, W4409887007) while the harvested corpus shows few field-validated, low-cost aquatic CV pipelines: only one of four extraction rows flagged “edge/low-cost” has obtained full text (polarization holographic flow-through imaging, W4391319604). **Global South** field studies with trained vision models appear in India, Thailand, Cambodia, and China (W4291123479, W4391755619, W3091414454, W3204790372), but **no obtained primary reports a Colombia field programme** for ML/CV microplastic monitoring (W4392657594 affiliation only).

We conclude that transfer to resource-limited and Latin American monitoring requires separating macro-litter CV from MP-specific ID, reporting detection metrics completely (e.g. precision and mAP together), and prioritising open benchmarks, field replication beyond small-n demos (W4383534319), and tiered architectures that pair affordable imaging with spectroscopic confirmation—not headline accuracy alone.

## 1. Introduction

Microplastics (typically <5 mm) are widespread in rivers, coastal waters, and wastewater effluents, driving regulatory interest and a large body of analytical methods, from microscopy and vibrational spectroscopy to remote sensing (W3134265767, W4296114416). Manual counting and library-based identification remain slow and operator-dependent; reviews of aquatic micro/nanoplastic occurrence and mitigation note heterogeneous methods and limited spatiotemporal coverage (W4210266455). Machine learning (ML) and computer vision (CV) are increasingly proposed to accelerate detection, classification, and quantification, including bibliometric evidence that imaging-focused AI research grew rapidly through 2022 (W4313826580).

This report is a **systematic map**; it does not claim to be the first review of microplastic pollution. Our scope is narrower and operational: **which modalities and model families are used for ML/CV-based microplastic detection in aquatic matrices (2019–2025), what metrics are reported, and what limits field deployment**, with explicit attention to resource-limited and Global South contexts where capital equipment, trained staff, and open data are scarce (W4296114416, W4318615471).

### 1.1 Motivation

Policy and research communities need comparable evidence to choose monitoring stacks. In practice, published studies report **incompatible metrics**—spectral classification accuracy, object-detection mAP, segmentation mIoU, and particle-count error—often on different size classes (macro litter versus true microplastics) and matrices (lab digestate versus in-situ water). A Thailand field study illustrates the risk: high **precision** on UV-excited polymer boxes coexists with low **mAP** when small objects are missed (W4391755619). Satellite studies report high pixel-level accuracy for floating debris at **10 m** resolution that cannot resolve sub-millimetre MPs (W4380082849, W4205835860). Without a map that groups methods and flags tensions, practitioners may over-interpret headline numbers.
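The precision-versus-mAP tension can be sketched for a single class. The snippet computes precision, recall, and average precision (AP, all-point interpolation) for a ranked list of detections. The counts are hypothetical and are **not** taken from W4391755619; they were chosen only so that the outputs land in a similar range. The point is structural: AP cannot exceed recall, so a detector that misses most small objects can report high precision and low AP at once.

```python
def average_precision(is_tp, n_ground_truth):
    """AP for one class: area under the interpolated precision-recall curve.

    is_tp: booleans for detections, sorted by descending confidence.
    n_ground_truth: number of labelled objects of that class.
    """
    tp = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(is_tp, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / n_ground_truth)
    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later (higher-recall) rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical scene: 100 labelled particles, 40 detections, 35 of them correct.
n_gt = 100
detections = [True] * 30 + [False, True] * 5
precision = sum(detections) / len(detections)
recall = sum(detections) / n_gt
ap = average_precision(detections, n_gt)

print(f"precision={precision:.3f} recall={recall:.3f} AP={ap:.3f}")
```

Most of what the detector reports is correct, yet 65 of the 100 particles are never found, and that missing recall caps AP regardless of how clean the returned boxes are.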

Reviews also highlight a growing **in-situ** monitoring literature (W4396828529) while many obtained primaries in our corpus remain **laboratory spectroscopy** or **macro-litter CV** (W4200249418, W4291123479). Closing that gap matters for rivers and coasts in Latin America and other Global South regions where monitoring capacity is uneven and local field validation is often missing from the published record (questions.md Q2, Q9).

### 1.2 Research questions

1. **Modalities:** What sensing pathways (microscopy, Raman/FTIR, hyperspectral, RGB/drone, satellite, microfluidic, etc.) dominate ML/CV microplastic studies in aquatic settings?  
2. **Models and metrics:** Which model families (CNN segmentation, YOLO-style detectors, classical ML on spectra, etc.) are used, and how are results measured?  
3. **Performance:** What performance ranges are reported under verified full-text evidence, and where are metrics `unverified` or abstract-only?  
4. **Deployment:** What barriers (cost, lab workflow, scale mismatch, automation vs expert review) recur—and what evidence exists for **low-cost**, **edge**, or **Global South field** deployment?

### 1.3 Scope and boundaries

We include studies meeting the protocol in [protocol.md](../../protocol.md): aquatic micro/nanoplastics plus ML/DL/CV or automated counting with reported performance or clear methodological limits. We **tag separately** systematic reviews (`is_review: true`) for gap analysis; they do not enter primary performance aggregates (W4296114416, W4409887007). We exclude toxicity-only, policy-only, and food-web papers without a detection pipeline.

Geographic emphasis: we document **Global South** authorship, field sites, and **Colombia** relevance where stated. As of full-text synthesis, **no obtained primary describes a Colombia field programme** for ML/CV MP detection (W4392657594); transferable evidence comes from other regions and must be labelled as such in Section 3.3 and Phase 9.

### 1.4 Contributions of this map

This factory run delivers: (1) a deduplicated OpenAlex corpus (**228** works) with PRISMA-style counts; (2) **37** structured extraction rows and **61** traceable claims in `outputs/knowledge/`; (3) modality and metric guides ([modality-map.md](../knowledge/modality-map.md), [metrics-legend.md](../knowledge/metrics-legend.md)); (4) a gap inventory ([gap-list.md](../knowledge/gap-list.md)) with **22** documented research gaps. The present manuscript synthesises those artifacts for readers who will not inspect the JSONL corpus directly.

### 1.5 Document structure

**Section 2** describes search, screening, and extraction. **Section 3** presents results by modality, metrics, and deployment/Global South gaps. **Section 4** discusses implications for monitoring design; **Section 5** states limitations of the map itself (retrieval bias, paywalled full text, English-centric metadata).

## 2. Methods

### 2.1 Search strategy

We conducted a **systematic map** (evidence synthesis without formal quality scoring of primary studies) of publications on machine learning and computer vision for microplastic detection in aquatic matrices. The protocol is documented in [protocol.md](../../protocol.md); this section records what was executed in the Ralph SLR factory run locked on **2026-05-18** (`manifest.json` → `search_lock_date`).

#### Database and API

- **Source:** [OpenAlex](https://openalex.org/) Works API (`https://api.openalex.org/works`), chosen for open bibliographic metadata, DOI linkage, and `is_review` typing.  
- **Polite pool:** requests included `mailto` per OpenAlex policy (`manifest.json` → `openalex_mailto`).  
- **Record type filter:** `type:article|review` (journal articles and reviews).  
- **Date filter:** `from_publication_date:2019-01-01` (publications from 1 January 2019 onward).  
- **Pagination:** results were retrieved per query until no new `paper_id` appeared; duplicates across queries were skipped at ingest.
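A minimal sketch of how one request matching the settings above could be assembled. The report states only the endpoint, the two filters, and the `mailto` convention; passing the keyword string through the `search` parameter, requesting 200 records per page, and using cursor paging are assumptions about how the harvest was run. The helper name and the email address are placeholders.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def build_works_url(query, mailto, cursor="*"):
    """Build one OpenAlex Works request URL (no network call is made)."""
    params = {
        "search": query,    # assumption: keyword string sent as full-text search
        "filter": "type:article|review,from_publication_date:2019-01-01",
        "per-page": 200,    # assumption: page size used at harvest
        "cursor": cursor,   # assumption: cursor paging; "*" requests the first page
        "mailto": mailto,   # polite-pool contact address
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_works_url("microplastic YOLO detection", "you@example.org")
print(url)
```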

In this pilot we did **not** search Scopus, Web of Science, or grey literature; the resulting retrieval bias toward OpenAlex entries with English metadata is a stated limitation (Section 5).

#### Search queries

Seven keyword strings were run **separately** (five core protocol strings plus two secondary passes added during harvest to improve FTIR and fluorescence coverage—see spine items 1.8–1.9 in factory progress). Each hit was stored in `corpus/structured/papers.jsonl` with `source_query` set to the string that first retrieved it.

| # | Query string | Records in corpus (first-hit attribution) |
| ---: | --- | ---: |
| 1 | `microplastic detection machine learning CNN` | 65 |
| 2 | `microplastic computer vision deep learning water` | 59 |
| 3 | `microplastic hyperspectral imaging classification` | 23 |
| 4 | `microplastic Raman spectroscopy machine learning` | 22 |
| 5 | `microplastic YOLO detection` | 21 |
| 6 | `microplastic FTIR machine learning` | 20 |
| 7 | `microplastic fluorescence imaging deep learning` | 18 |
| | **Unique works after cross-query dedupe** | **228** |

The deduplicated corpus size (**228** unique `paper_id` values) is the **identified** count used in PRISMA-style reporting (`manifest.json` → `stats.identified`; `stats.after_dedupe`).
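First-hit attribution can be illustrated with a short sketch, assuming queries are processed in the order of the table and each `paper_id` keeps the first query that retrieved it. The function name and the toy IDs are hypothetical; only the per-query counts come from the table above.

```python
def dedupe_first_hit(hits_by_query):
    """Map each paper_id to the first query that retrieved it.

    hits_by_query: ordered list of (query_string, [paper_id, ...]) pairs.
    """
    corpus = {}
    for query, paper_ids in hits_by_query:
        for paper_id in paper_ids:
            # setdefault keeps the existing attribution for duplicates.
            corpus.setdefault(paper_id, query)
    return corpus


# Toy harvest with hypothetical IDs: W2 is returned by both queries.
hits = [
    ("microplastic detection machine learning CNN", ["W1", "W2"]),
    ("microplastic YOLO detection", ["W2", "W3"]),
]
corpus = dedupe_first_hit(hits)

# Because each work is counted once, the first-hit column must sum to the corpus size.
first_hit_counts = [65, 59, 23, 22, 21, 20, 18]
print(len(corpus), sum(first_hit_counts))
```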

#### Deduplication and record schema

- **Key:** OpenAlex `paper_id` (e.g. `W4200249418`); secondary check on DOI when present in `manifest.json` sources.  
- **Required metadata per row:** `paper_id`, `openalex_id`, `doi`, `title`, `publication_year`, `abstract`, `screening_status` (default `candidate`), `source_query`, `harvest_status`, `is_review`.  
- **Schema reference:** [papers-jsonl-schema.md](../../corpus/structured/papers-jsonl-schema.md).
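As an illustration of the required-metadata rule, the sketch below checks one JSONL line for the ten required keys. It tests key presence only (a `doi` may legitimately be empty), and the sample row is invented: the `paper_id`, title, and `harvest_status` value are placeholders, not corpus records.

```python
import json

REQUIRED_FIELDS = (
    "paper_id", "openalex_id", "doi", "title", "publication_year",
    "abstract", "screening_status", "source_query", "harvest_status",
    "is_review",
)


def missing_fields(jsonl_line):
    """Return the required keys that are absent from one papers.jsonl row."""
    row = json.loads(jsonl_line)
    return [field for field in REQUIRED_FIELDS if field not in row]


# Hypothetical row for illustration only.
sample = json.dumps({
    "paper_id": "W0000000000",
    "openalex_id": "https://openalex.org/W0000000000",
    "doi": None,
    "title": "Placeholder title",
    "publication_year": 2023,
    "abstract": "Placeholder abstract.",
    "screening_status": "candidate",
    "source_query": "microplastic YOLO detection",
    "harvest_status": "placeholder",
    "is_review": False,
})
incomplete = json.dumps({"paper_id": "W0000000000", "title": "Placeholder title"})

print(missing_fields(sample))
print(missing_fields(incomplete))
```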

Reviews were **tagged** (`is_review: true`) at harvest and are included in the bibliography for gap mapping but excluded from primary performance aggregates (Section 1.3; W4296114416 as example spectroscopy review).

#### What happens next in the pipeline

Title–abstract screening (Section 2.2), full-text acquisition for forwarded papers (Section 2.2–2.3), and structured extraction (Section 2.3) were applied to this fixed corpus; no new OpenAlex queries were added after the search lock date.

### 2.2 Screening

Screening was two-stage: **title–abstract** on all **228** identified records, then **full-text** eligibility on **116** forwards. Decisions are logged in `corpus/structured/screening-log.csv` (344 rows: 228 TA + 116 FT) with protocol-aligned `reason` text for every exclusion. PRISMA counts match `manifest.json` → `stats` and [prisma-flow.md](../../corpus/structured/prisma-flow.md).

#### 2.2.1 Title–abstract screening

- **Screeners:** single reviewer pass (agent-labelled `ralph-agent` in the log) against [protocol.md](../../protocol.md) inclusion: aquatic micro/nanoplastics plus ML/DL/CV or automated counting with performance or clear method limits.  
- **Excluded (n = 112):** records failing aquatic context, lacking a detection/ID pipeline, or out-of-scope topics (e.g. ecology-only distribution, deep-ocean observing without MP detection ML, tribology/industry 4.0). The most frequent TA exclusion string was *“does not meet inclusion: aquatic MP + ML detection”* (**13** records); other reasons are itemised in `screening-log.csv`.  
- **Forwarded to full text (n = 116):** met inclusion at title/abstract, including **22** works tagged `is_review: true` in `papers.jsonl` (retained for gap mapping, not primary performance tables).

At this stage, **100%** of identified candidates received a TA decision (`stats.screened_title_abstract` = 228).

#### 2.2.2 Full-text acquisition and eligibility

For each forward, we attempted open-access PDF or PMC/HTML retrieval under project access rules (no paywall bypass). Harvest terminated at search lock with **no pending** retrievals (`stats.full_text_pending_retrieval` = 0).

| Stage | n | Notes |
| --- | ---: | --- |
| Full-text articles sought | 116 | All TA forwards |
| Full-text obtained (PDF/PMC/HTML) | 31 | 25 OA PDFs in `corpus/raw/` + additional HTML/text (`stats.oa_pdfs_in_raw`, `stats.full_text_html_obtained`) |
| Full-text not retrieved | 85 | See breakdown below |
| **Included in qualitative synthesis** | **31** | `screening_phase=full_text`, `decision=included` — each has `corpus/summaries/{paper_id}.md` |
| Excluded at full text | 85 | Access barrier only (not re-judged on science after forward) |

**Not retrieved (n = 85)**, from `papers.jsonl` `harvest_status` and screening reasons:

| Barrier | n |
| --- | ---: |
| OA full text not obtained at harvest lock (deferred) | 40 |
| Paywalled | 30 |
| OA PDF download failed (publisher block / no PDF URL) | 15 |

Examples of high-interest forwards without obtained text: Mediterranean FTIR+ML (W2952839204, paywalled), fish-intestine HSI+SVM (W2936115560, paywalled), 2025 ML+MP systematic review (W4409887007, paywalled). These remain in `papers.jsonl` with `screening_status: forward_fulltext` for transparency but do not supply verified metrics to extraction.

#### 2.2.3 Included set composition

Of **31** included full texts:

- **9** are systematic or narrative **reviews** (`is_review: true`) — used for gap analysis only (e.g. Raman in water bodies W4296114416, MP identification methods survey W3134265767, imaging-AI bibliometric review W4313826580).  
- **22** are primary-style articles with obtained full text used for performance-oriented synthesis.

Forwards without obtained text—such as the 2024 in-situ aquatic MP systematic review (W4396828529) or the 2025 ML+MP detection review (W4409887007)—remain in the bibliography as `forward_fulltext` but are **not** in the included summary set.

**Structured extraction** (Section 2.3) covers **37** papers: all **31** included summaries plus **six** additional `forward_fulltext` rows prioritised for modality coverage (abstract-only metrics marked `unverified`). **28** extraction rows are primary studies; **9** are reviews.

#### 2.2.4 PRISMA flow diagram

Authoritative stage counts and FT-not-retrieved breakdown are maintained in [`corpus/structured/prisma-flow.md`](../../corpus/structured/prisma-flow.md) (synced with `manifest.json` → `stats` at search lock **2026-05-18**). The diagram below follows PRISMA 2020 staging adapted for a **systematic map**: every obtained full text was summarised; forwards without open access were not dropped from the bibliography but cannot supply verified metrics (Section 5.3).

```text
IDENTIFICATION
  Records identified (OpenAlex; 7 queries; 2019–2025)                  n = 228
  Records after deduplication                                          n = 228

SCREENING
  Records screened (title/abstract)                                    n = 228
    Excluded at title/abstract                                         n = 112
    Forwarded to full text                                             n = 116

ELIGIBILITY (full text)
  Full-text articles sought                                            n = 116
    Full-text obtained (summarised)                                    n =  31
      · 25 OA PDF in corpus/raw/
      · 6 HTML or normalized full text (no PDF in corpus/raw/)
      · 22 primary-style + 9 reviews (reviews: gap analysis only)
    Full-text not retrieved                                            n =  85
      · Paywalled                                                      n =  30
      · OA download failed                                             n =  15
      · OA not retrieved after sought                                  n =  40

INCLUDED (map synthesis)
  Narrative summaries (corpus/summaries/)                              n =  31
  Structured extraction rows (extraction.csv)                          n =  37
  Evidence claims (outputs/knowledge/claims.jsonl)                     n =  61
```

| Stage | n | `manifest.stats` key |
| --- | ---: | --- |
| Identified | 228 | `identified` |
| After dedupe | 228 | `after_dedupe` |
| Screened (TA) | 228 | `screened_title_abstract` |
| Excluded (TA) | 112 | `excluded_title_abstract` |
| Forwarded to FT | 116 | `forward_fulltext` |
| Full text sought | 116 | `full_text_sought` |
| Full text obtained | 31 | `full_text_obtained` |
| Full text not retrieved | 85 | `full_text_not_retrieved` |
| Included / summarised | 31 | `included` |
| OA PDFs in raw | 25 | `oa_pdfs_in_raw` |

Factory targets met: **≥35** forwards (**116**); **≥20** OA PDFs (**25**); **≥120** unique records (**228**). Title–abstract exclusion rationales are documented in [`screening-rationale.md`](../knowledge/screening-rationale.md); paywalled high-interest forwards include W2936115560 (fish HSI), W2952839204 (Mediterranean FTIR+ML), and W4409887007 (2025 ML+MP review).
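The stage counts above can be cross-checked mechanically. The sketch below is one way to do it: the `stats` keys and values are copied from the table, while the breakdown dictionaries, their key names, and the function name are hypothetical conveniences for this example.

```python
stats = {
    "identified": 228, "after_dedupe": 228, "screened_title_abstract": 228,
    "excluded_title_abstract": 112, "forward_fulltext": 116,
    "full_text_sought": 116, "full_text_obtained": 31,
    "full_text_not_retrieved": 85, "included": 31, "oa_pdfs_in_raw": 25,
}
# Hypothetical key names; counts come from Section 2.2.2 and the flow diagram.
not_retrieved = {"paywalled": 30, "oa_download_failed": 15, "oa_not_retrieved": 40}
obtained_formats = {"oa_pdf": 25, "html_or_text": 6}
screening_log_rows = 344


def prisma_inconsistencies(stats, breakdown, formats, log_rows):
    """Return the names of any stage-count checks that fail."""
    s = stats
    checks = {
        "dedupe cannot add records": s["after_dedupe"] <= s["identified"],
        "every record screened": s["screened_title_abstract"] == s["after_dedupe"],
        "TA split": s["excluded_title_abstract"] + s["forward_fulltext"]
        == s["screened_title_abstract"],
        "all forwards sought": s["full_text_sought"] == s["forward_fulltext"],
        "FT split": s["full_text_obtained"] + s["full_text_not_retrieved"]
        == s["full_text_sought"],
        "barrier breakdown": sum(breakdown.values()) == s["full_text_not_retrieved"],
        "format breakdown": sum(formats.values()) == s["full_text_obtained"],
        "PDF count": formats["oa_pdf"] == s["oa_pdfs_in_raw"],
        "log rows": log_rows == s["screened_title_abstract"] + s["full_text_sought"],
    }
    return [name for name, ok in checks.items() if not ok]


print(prisma_inconsistencies(stats, not_retrieved, obtained_formats, screening_log_rows))
```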

### 2.3 Data extraction

Structured extraction complements the **31** narrative summaries (`corpus/summaries/{paper_id}.md`) and feeds the knowledge layer (`outputs/knowledge/`). The workflow below was executed in Phase 6 of the factory run; field definitions match [extraction-template.csv](../../corpus/structured/extraction-template.csv).

#### 2.3.1 Extraction table (`extraction.csv`)

- **File:** `corpus/structured/extraction.csv` — **37** data rows (one per extracted `paper_id`), plus header row.  
- **Scope:** Papers prioritised by **modality batches** (microscopy/CV, Raman, hyperspectral, FTIR, YOLO/object detection, wastewater reviews, marine/coastal, freshwater, edge/low-cost, LATAM/Global South flags)—not a complete row for every included full text. **W4308496878** (O-PTIR+Raman, included FT) is synthesised in summaries and claims but not yet a row in `extraction.csv` (questions.md Q7).  
- **Composition (`manifest.json`):** **28** primary-style rows, **9** review/synthesis rows, **8** abstract-only forwards (no obtained PDF at lock), **12** rows with `global_south=yes`.

| Field | Purpose |
| --- | --- |
| `modality` | Controlled vocabulary (e.g. `raman_spectroscopy`, `rgb_object_detection_uv`) — roll-up in [extraction-by-modality.csv](../../corpus/structured/extraction-by-modality.csv) (34 modality groups) |
| `model_type` | Algorithm family (CNN, YOLO, RDF, PLS-DA, etc.) |
| `matrix` | Aquatic or related matrix (river, marine, wastewater, lab digestate, …) |
| `scale` | Size class or task scale (macro litter vs MP, nanoplastic lab, …) |
| `metrics` | Reported performance strings; `unverified` when not confirmed from obtained FT in this corpus |
| `dataset_size` | Sample/image/spectra counts when stated |
| `open_data` | Public dataset release (`yes` / `no` / `unverified`) |
| `edge_low_cost` | Author or corpus flag for portable/low-cost claims (`yes` only with explicit basis) |
| `global_south` | `yes` only with explicit affiliation, field site, or author-stated GS context per protocol |
| `limitation_author` | Author-reported limits (lab-only, capex, small field *n*, etc.) |

Reviews are extracted for **gap mapping** (e.g. W4296114416 Raman water review) but are **excluded from primary performance aggregates** in Results (Section 3.2).

#### 2.3.2 Evidence sources per row

For each `paper_id`, extractors used, in order of precedence:

1. **Obtained full text** — `corpus/normalized/{paper_id}.txt` from OA PDF or PMC HTML.  
2. **Structured summary** — six-section template (Objective, Methods, Data/modality, Metrics, Limitations, LATAM relevance).  
3. **OpenAlex abstract** — only when full text unavailable; metrics then marked `unverified` unless cross-checked.
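The precedence rule can be sketched as a small selector. The source labels and the function name are hypothetical; the logic follows the list above, including the rule that abstract-sourced metrics are tagged `unverified` unless cross-checked.

```python
# Hypothetical source labels, listed in order of precedence.
PRECEDENCE = ("full_text", "structured_summary", "openalex_abstract")


def select_evidence(available, cross_checked=False):
    """Pick the highest-precedence source and say whether metrics need `unverified`.

    available: set of source labels that exist for one paper_id.
    Returns (source, needs_unverified_tag).
    """
    for source in PRECEDENCE:
        if source in available:
            needs_tag = source == "openalex_abstract" and not cross_checked
            return source, needs_tag
    # No evidence at all: nothing can be verified.
    return None, True


print(select_evidence({"openalex_abstract", "full_text"}))
print(select_evidence({"openalex_abstract"}))
```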

**Phase 6.11** verified a random sample of **10** primary rows with obtained FT (`random.seed(83)`): all sampled metric strings matched the normalised full text. **Phase 6.12** audited corpus-wide `unverified` usage and corrected mis-tagged fields (e.g. `dataset_size` wrongly set to `forward_fulltext abstract-only`).
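A seeded draw of this kind might look like the sketch below. The report states only `random.seed(83)` and the sample size of 10; the use of `random.sample`, the sorting step, and the placeholder IDs are assumptions, so this will not necessarily reproduce the rows audited in Phase 6.11.

```python
import random


def sample_for_verification(paper_ids, k=10, seed=83):
    """Draw a reproducible sample of rows to check against normalised full text."""
    random.seed(seed)
    # Sort first so the draw does not depend on the order rows were loaded in.
    return random.sample(sorted(paper_ids), k)


# Placeholder IDs standing in for primary rows with obtained full text.
primary_ids = [f"P{i:02d}" for i in range(1, 23)]
first = sample_for_verification(primary_ids)
second = sample_for_verification(reversed(primary_ids))
print(first == second, len(first))
```

Sorting before sampling is a design choice: with a fixed seed, the same set of IDs always yields the same audit sample, whatever order the file was read in.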

#### 2.3.3 Claims layer (`claims.jsonl`)

- **File:** `outputs/knowledge/claims.jsonl` — **61** JSON lines, each with `claim`, `paper_id`, `confidence` (`high` \| `medium` \| `low`), and `notes`.  
- **Batches:** 23 (batch A, obtained-FT primaries) + 18 (batch B, reviews/abstract forwards) + 20 (batch C, gaps and cross-corpus patterns) — factory target **≥60** met.  
- **Rule:** Factual claims in `slr.md` must trace to a `paper_id`; reviews may support **gap** statements only, not primary performance rankings (e.g. in-situ barrier W4396828529; automation bibliometric W4313826580).
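The traceability rule lends itself to an automated check. The sketch below flags claim lines whose `confidence` falls outside the three allowed values or whose `paper_id` is not in the corpus; the function name and the sample claims are hypothetical, and the final line only confirms the batch arithmetic stated above.

```python
import json

ALLOWED_CONFIDENCE = {"high", "medium", "low"}
REQUIRED_KEYS = ("claim", "paper_id", "confidence", "notes")


def claim_problems(jsonl_line, corpus_ids):
    """List schema and traceability problems for one claims.jsonl line."""
    claim = json.loads(jsonl_line)
    problems = [f"missing {key}" for key in REQUIRED_KEYS if key not in claim]
    if claim.get("confidence") not in ALLOWED_CONFIDENCE:
        problems.append("confidence outside high/medium/low")
    if claim.get("paper_id") not in corpus_ids:
        problems.append("paper_id not in corpus")
    return problems


# Hypothetical corpus and claims for illustration only.
corpus_ids = {"W0000000001"}
good = json.dumps({"claim": "Placeholder claim.", "paper_id": "W0000000001",
                   "confidence": "medium", "notes": ""})
bad = json.dumps({"claim": "Placeholder claim.", "paper_id": "W9999999999",
                  "confidence": "certain", "notes": ""})

print(claim_problems(good, corpus_ids))
print(claim_problems(bad, corpus_ids))
print(sum({"A": 23, "B": 18, "C": 20}.values()))
```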

#### 2.3.4 Supporting knowledge artifacts

| Artifact | Role in synthesis |
| --- | --- |
| [modality-map.md](../knowledge/modality-map.md) | `paper_id` links by sensing cluster |
| [metrics-legend.md](../knowledge/metrics-legend.md) | Comparability rules (mAP vs accuracy, etc.) |
| [contradictions.md](../knowledge/contradictions.md) | Tension pairs (e.g. W4391755619 precision vs mAP) |
| [gap-list.md](../knowledge/gap-list.md) | 22 documented research gaps (8 P1 for deployment/LATAM) |
| [glossary.md](../knowledge/glossary.md) | Shared terminology |
| [reviews-synthesis.md](../knowledge/reviews-synthesis.md) | Thematic map of `is_review: true` works |

#### 2.3.5 Full-text screening log linkage

Full-text **included/excluded** decisions (31 / 85) are in `screening-log.csv` (`screening_phase=full_text`) and were synced to `manifest.stats` in Phase 6.15. Extraction rows do not duplicate those access outcomes; they record **scientific** variables for papers selected for modality tables.

#### 2.3.6 Limitations of the extraction process

- **Selection bias:** Modality batches overweight spectroscopy and RGB detection forwards; not all 116 forwards have rows.  
- **Metric heterogeneity:** Values are stored as reported strings, not normalised numerics—cross-paper comparison requires [metrics-legend.md](../knowledge/metrics-legend.md).  
- **Abstract-only rows:** Eight extractions lack obtained FT; metrics remain `unverified` or abstract-sourced (W4408220111 Brazil beach, W4408550134 Mexico FTIR, W4392657594 Colombia affiliation, etc.).  
- **No formal risk-of-bias scoring:** This is a **map**, not a GRADE-style review.

Results sections draw on this extraction layer; Section 3.2 applies the primary/review split consistently.

### 2.4 Quality and bias notes

This synthesis is a **systematic map** of the literature landscape, **not** a meta-analysis and **not** a risk-of-bias–assessed systematic review. We did **not** apply RoB 2, QUADAS-2, or PROBAST to individual studies. Readers should treat performance numbers as **indicative ranges** within modality families, not as a league table of “best” methods.

#### Search and selection bias

| Source of bias | Effect on findings | Mitigation in this run |
| --- | --- | --- |
| **OpenAlex-only** pilot | May miss records indexed only in Scopus/Web of Science; metadata often English-centric | Documented in Section 2.1; seven query strings with dedupe |
| **Keyword retrieval** | Favours papers using “machine learning”, “YOLO”, “CNN”, etc.; may under-represent novel but differently worded methods | Secondary FTIR and fluorescence queries (Section 2.1) |
| **Single-reviewer** title/abstract screen | No inter-rater reliability (κ) estimate; inclusion boundaries may drift | All decisions logged with `reason` in `screening-log.csv` for audit |
| **Forward-then-access** design | **85/116** forwards lack obtained full text—high-impact paywalled work (e.g. W2952839204, W2936115560, W4409887007) can bias toward OA-accessible labs and journals | Transparent PRISMA counts; abstract-only metrics tagged `unverified` |
| **Modality-targeted extraction** | `extraction.csv` is not a census of all 31 included papers; under-represents rare modalities | `extraction-by-modality.csv` + summaries for included set |

#### Outcome reporting and metric bias

- **Publication bias:** ML and CV studies disproportionately report favourable accuracy, mAP, or F1; negative or null results are likely underpublished (bibliometric imaging review W4313826580 notes rapid growth but limited critical appraisal).  
- **Metric inconsistency:** Authors report accuracy, precision, mAP, mIoU, κ, and count error on **incommensurate tasks** (spectral ID vs macro-litter detection). Headline percentages are not comparable without task context ([metrics-legend.md](../knowledge/metrics-legend.md); W4391755619 precision vs mAP tension in [contradictions.md](../knowledge/contradictions.md)).  
- **Scale conflation:** Satellite and riverine CV papers (W4380082849, W4291123479) are easily misread as “microplastic detection” when labels target **macro debris** or litter management.  
- **Lab versus field:** High performance on cleaned sediment, digestate, or spiked particles (W3003736709, W4382931577) may not transfer to turbid field water—reviews stress in-situ gaps (W4396828529).

#### Technology and geography bias

- **Capital equipment:** Obtained primaries emphasise µ-Raman, μFTIR, HSI, and satellite stacks (W4366815281, W4200249418, W4213300830)—configurations common in well-funded European or East Asian labs, not typical municipal LATAM monitoring budgets (W4296114416, W4318615471).  
- **Global South evidence:** Twelve extraction rows flag `global_south=yes`, but only four obtained-FT vision primaries combine **field geography + trained models** in India, Thailand, Cambodia, and China (W4291123479, W4391755619, W3091414454, W3204790372). **No Colombia field programme** appears in obtained primaries (W4392657594 affiliation only), which limits direct policy transfer (GAP-GEO-01).
- **Automation narrative:** Reviews describe automation potential (W4404459247), but the obtained μFTIR workflow still requires expert dual-control (W4200249418); citing the reviews without that qualification overstates “fully automated” monitoring.

#### Process bias (factory / agent extraction)

- Metadata harvest, screening, normalisation, and extraction were executed in an **agent-assisted Ralph loop** with human-defined protocol—not blinded independent dual extraction.  
- **Phase 6.11** random metric verification (10 primaries, `random.seed(83)`) reduces transcription error but does not eliminate interpretive bias in modality assignment.  
- `claims.jsonl` includes corpus-level patterns (e.g. Global South field-site counts) anchored to representative `paper_id`s; these are synthesis judgments, not single-paper experimental results.

#### How Results and Discussion use this lens

Section **3** reports descriptive clusters and cites `paper_id`s without ranking studies. Section **4** interprets deployment gaps. Section **5** restates corpus-level limitations (search lock **2026-05-18**, no forward citation chasing). Primary performance tables **exclude** `is_review: true` rows unless explicitly labelled as review evidence.

## 3. Results

Primary evidence below excludes standalone reviews unless noted. Metrics are quoted as in sources; see Section 2.4 and [metrics-legend.md](../knowledge/metrics-legend.md).

### 3.1 Microscopy and RGB vision

This cluster covers **visible or optical imaging** pipelines—microscopy, microfluidics, drone/aerial RGB, underwater object detection, and edge cameras—not vibrational spectroscopy (Section 3.2, spine 8.8).

#### 3.1.1 Microscopy and lab optical imaging

| `paper_id` | Year | Setting | Model / pipeline | Reported metrics | MP vs macro | LATAM / edge |
| --- | ---: | --- | --- | --- | --- | --- |
| W3003736709 | 2020 | Beach sediment lab | Sauvola + CNN features | Count error 1.4%; class error &lt;4% | MP (1–5 mm), not in-situ water | — |
| W4282979647 | 2022 | Clam digestate lab | U-Net / FCN / DeepLab | mF1 0.736; mIoU 0.617; mRecall 0.883 (best variants) | Stained MPs, bivalve matrix | — |
| W4391319604 | 2024 | Flow microfluidic lab | Polarization holographic + classifier | Accuracy up to 96%; Bland-Altman bias SD 0.05935 | MPs; authors claim portable hardware | edge_low_cost |
| W4383534319 | 2023 | Seawater surface + lab | CNN / ResNet34 / SVM / RF | Accuracy &gt;93%; AUC 0.98±0.02 | Small MPs (~10–45 µm); field demo **n=5** | — |
| W4409162823 | 2025 | Consumer product (abstract) | YOLOv5 + phone microscope | unverified (YOLO splits cited) | Not aquatic field | edge (abstract) |

**Synthesis.** Obtained full texts show strong **segmentation or classification metrics in controlled matrices** (sediment, digestate, flow cell) but weak **in-situ aquatic** replication. MP-Net (W4282979647) illustrates architecture trade-offs: highest **mRecall** (0.883) does not coincide with best **mF1** (0.736 on U-Net4). Polarization holographic imaging (W4391319604) is the only microscopy-class row with both obtained FT and an **edge/low-cost** flag, yet still depends on specialised optics and lab flow-through hardware.
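
Why the recall-best and F1-best architectures can differ is easy to see with a toy calculation. The scores below are invented for two unnamed architectures and are not values from W4282979647; the sketch also uses plain F1 on single precision/recall pairs, whereas the paper's mF1 and mRecall are means whose exact averaging scheme is not reproduced here.

```python
def best_by(results, metric):
    """Name of the architecture with the highest value of `metric`."""
    return max(results, key=lambda name: results[name][metric])

# Invented scores for illustration only.
results = {
    "arch_a": {"precision": 0.70, "recall": 0.78},
    "arch_b": {"precision": 0.55, "recall": 0.88},
}
for scores in results.values():
    p, r = scores["precision"], scores["recall"]
    scores["f1"] = 2 * p * r / (p + r)   # harmonic mean of precision and recall

print(best_by(results, "recall"), best_by(results, "f1"))
```

In this toy case the higher-recall architecture loses on F1 because its extra recall is bought with a larger drop in precision.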

W4385411640 provides **harmonised particle metrology** (maximum Feret diameter, shape descriptors) for MP imaging without training a detector—relevant when comparing CV outputs across studies, not a detection benchmark.

#### 3.1.2 RGB object detection and aerial vision

| `paper_id` | Year | Setting | Model | Reported metrics | MP vs macro | GS field |
| --- | ---: | --- | --- | --- | --- | --- |
| W4291123479 | 2022 | India urban river | Custom YOLO | mAP 89%; F1 0.8; recall 86% | **Macro solid waste** | yes (India) |
| W3091414454 | 2020 | Cambodia drone | PLD-CNN / PLQ-CNN | Accuracy up to 83%; PLQ 60–71% | **Macro litter** orthomosaics | yes (Cambodia) |
| W4391755619 | 2024 | Thailand coastal UV | Faster R-CNN-FPN | Precision 85.5–87.8%; mAP 33.9–35.7% | **MP** polymer boxes (field UV) | yes (Thailand) |
| W4321194910 | 2023 | Underwater / AUV | EfficientDet | ΔAP +1.2–2.6% vs baseline | **Macro marine debris** | — |
| W3204790372 | 2021 | Underwater marine | Mask R-CNN + attention | +9.6% mAP; +5.0% seg vs baseline (unverified wrap) | **Macro garbage** | yes (China); embeddable |
| W4385454320 | 2023 | Underwater litter (abstract) | YOLACT / Mask R-CNN | unverified | Macro litter | — |
| W4400418758 | 2024 | Lab flume (abstract) | YOLOv5 + DeepSORT | unverified | MP motion (flume) | edge (abstract) |

**Synthesis.** RGB vision papers dominate the **forwarded CV subset** (five `rgb_object_detection*` modality groups in [modality-map.md](../knowledge/modality-map.md)) but most target **litter or debris management**, not polymer identification of microplastics in water. The clearest **Global South field** vision evidence is riverine YOLO in India (W4291123479), drone litter mapping in Cambodia (W3091414454), UV-excited Faster R-CNN on coastal MPs in Thailand (W4391755619), and embeddable underwater Mask R-CNN in China (W3204790372).

W4391755619 is the standout **field MP-oriented** detector but exposes metric tension: **high precision** on classified boxes alongside **low mAP** on small objects—readers must not cite precision alone as “detection accuracy” (contradictions §1).
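
A minimal numeric sketch shows how high precision can coexist with weak detection coverage. The counts are hypothetical and are not taken from W4391755619; note also that recall is only a proxy here, since mAP is a ranking-based summary over recall levels and is not computed by this snippet.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical detector output on 100 ground-truth particles.
tp, fp, fn = 35, 5, 65
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here most raised alerts are correct, yet roughly two thirds of the particles are never flagged, which is the situation a precision-only citation would hide.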

#### 3.1.3 Deployment readout (microscopy + RGB)

| Theme | Finding | Illustrative `paper_id` |
| --- | --- | --- |
| Aquatic field MP CV rare | Only Thailand UV Faster R-CNN combines field coastal imagery with MP labels among obtained FT | W4391755619 |
| Macro-litter proxy common | River YOLO, Cambodia drone, and underwater/AUV papers measure litter/debris | W4291123479, W3091414454, W4321194910 |
| Low-cost claims thinly verified | edge_low_cost on four extraction rows; one microscopy FT (W4391319604); rest abstract or macro-litter | W4391319604, W4400418758, W3204790372 |
| Small field validation | Microfluidic seawater demo n=5 despite &gt;93% lab accuracy | W4383534319 |

For resource-limited monitoring, microscopy/RGB evidence supports **tier-1 screening** (hotspots, litter accumulations) more than **standalone MP polymer quantification** without spectroscopic confirmation (Section 4, Phase 8.12).

### 3.2 Spectroscopy and hyperspectral imaging (Raman, FTIR, HS)

Vibrational and spectral imaging dominate **obtained-full-text** primary evidence for **polymer-level** microplastic identification. RGB vision (Section 3.1) rarely achieves polymer ID without a spectroscopic step. Reviews stress Raman/FTIR maturity and cost (W4296114416, W4318615471) but are not counted as primary benchmarks here.

#### 3.2.1 Raman and surface-enhanced Raman

| `paper_id` | Year | Matrix | Model | Reported metrics | Notes |
| --- | ---: | --- | --- | --- | --- |
| W4362015000 | 2023 | Weathered SloPP-E lab | Subspace KNN ensemble | Accuracy 93.81% (22 polymers) | 97-sample test; lab Raman |
| W4366815281 | 2023 | German wastewater catchment | Deep learning (µ-Raman) | Precision ≥97.1%; recall ≥99.4% (5 polymers) | 64k spectra; polymers in 10.7% of 47 samples |
| W4382931577 | 2023 | Spiked tap water / lab | Random forest | Accuracy 98.8%; sens 98.5%; spec 100% | 24 nanoplastic types; env. validation unverified |
| W3172017684 | 2021 | Water monitoring prototype | CNN on deep UV Raman | CNN ~97% (multi-analyte) | MP pathway cited; prototype capex |
| W4304690559 | 2022 | Pure water bench | SERS substrates (no ML) | LOD 0.6 M aggregation (PS); PE weak | Bench only; substrate fabrication |
| W4404688861 | 2024 | Ocean in-situ (abstract) | Raman + ML preliminary | unverified | Not deployed; FT not obtained |
| W4392657594 | 2024 | Laboratory (abstract) | Raman ML noise study | unverified | **Colombia affiliation**; FT not obtained |

**Synthesis.** Raman+ML papers report class-level metrics of roughly **93–99%** on controlled or environmental extracts (W4362015000, W4366815281, W4382931577) but constrain polymer coverage (five classes in W4366815281) or remain **lab/spiked** (W4382931577). The deep UV Raman CNN prototype (W3172017684) targets multi-analyte water monitoring with **unverified MP-specific performance**.

Included full text **W4308496878** (O-PTIR + simultaneous Raman, not in `extraction.csv`) compares instrument reproducibility and cites **1.4%** success for unaided visual screening versus Raman—supporting spectroscopic confirmation over vision-only ID in synthesis (questions.md Q7).

High-impact forward **W2952839204** (Mediterranean FTIR+ML, 179 citations) remains paywalled and is absent from verified metrics.

#### 3.2.2 FTIR and vibrational ML

| `paper_id` | Year | Matrix | Model | Reported metrics | GS / LATAM |
| --- | ---: | --- | --- | --- | --- |
| W4200249418 | 2021 | Water, sediment, sludge, soil, air, sea salt | RDF on μFTIR imaging | Accuracy 0.9766; κ 0.9690; PP sens 0.957; PVC sens 1.000 | — |
| W4408550134 | 2025 | Six-polymer lab (abstract) | k-NN, SVM, RF, CNN, MLP | unverified | Mexico (abstract) |
| W3196128465 | 2021 | Ocean MPs (abstract) | FTIR/Raman ML (title) | unverified | Brazil UFSC (abstract) |

**Synthesis.** W4200249418 is the strongest **multi-matrix μFTIR imaging + ML** evidence in the corpus (>20 polymer classes, eight environmental matrices) but requires **FPA-μFTIR**, expert dual-control, and commercial software—automation is ML-assisted, not field autonomous. Abstract-only LATAM FTIR rows (W4408550134, W3196128465) cannot support quantitative comparison until full text is retrieved.

#### 3.2.3 Hyperspectral imaging and satellite remote sensing

| `paper_id` | Year | Platform | Model | Reported metrics | MP vs macro |
| --- | ---: | --- | --- | --- | --- |
| W4213300830 | 2022 | SWIR HSI, Po River | HI-PLS-DA | Concentration 1.89–8.22 particles/m³ (4 stations) | Freshwater MPs |
| W3155690422 | 2021 | PRISMA + Sentinel-2 pansharpen | Plastic indexes | ~86% accuracy; ~8% pixel coverage target | Marine litter simulation |
| W4380082849 | 2023 | Sentinel-2 multispectral | XGBoost | 98% / 83% / 75% by scenario | **Macro** floating debris |
| W4205835860 | 2022 | Sentinel-2 (MARIDA) | U-Net / RF | avg IoU 0.57; per-class IoU 0.02–1.0 | **Marine debris** benchmark |
| W4408220111 | 2025 | Brazil beach RS + Raman + ML (abstract) | RF / Gradient Boosting | 6–35 MPs/m² cited; accuracy unverified | **Brazil field** (abstract) |

**Synthesis.** **True MP mapping** in water at particle scale appears in **SWIR HSI + PLS-DA** on a European river (W4213300830) with capital equipment and unverified per-level class metrics in extraction. **Satellite** papers (W4380082849, W4205835860) offer scalable **debris** monitoring at 10 m resolution, not sub-mm MPs. Scenario labelling is mandatory: W4380082849 reports **98%** (scenario 1), **83%** (plastic-pixel scenario 2), and **75%** (ensemble quantification).
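
One way to enforce scenario labelling is to store every reported value with its scenario and refuse to return a number when the request is ambiguous. The sketch below is an assumption-laden illustration: the `ReportedMetric` record and `quote` helper are hypothetical, and the three values are those this report tabulates for W4380082849.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReportedMetric:
    paper_id: str
    metric: str
    value_pct: float
    scenario: str  # free-text label copied from the extraction notes

def quote(records, paper_id, scenario=None):
    """Return one metric; refuse to pick silently when several scenarios exist."""
    matches = [r for r in records if r.paper_id == paper_id]
    if scenario is not None:
        matches = [r for r in matches if r.scenario == scenario]
    if len(matches) != 1:
        raise ValueError(f"{paper_id}: {len(matches)} candidate values; name a scenario")
    return matches[0]

records = [
    ReportedMetric("W4380082849", "accuracy", 98.0, "scenario 1"),
    ReportedMetric("W4380082849", "accuracy", 83.0, "scenario 2 (plastic pixels)"),
    ReportedMetric("W4380082849", "accuracy", 75.0, "ensemble quantification"),
]
print(quote(records, "W4380082849", "scenario 2 (plastic pixels)").value_pct)
```

Calling `quote(records, "W4380082849")` without a scenario raises `ValueError`, which is the intended behaviour: a headline number cannot be quoted without saying which scenario it belongs to.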

Review W4318615471 (Malaysia affiliation) aggregates HSI waste-sorting literature and cites **&lt;80%** precision for sub-mm MPs in prior work, which informs deployment expectations but does not count as primary evidence.

#### 3.2.4 Spectroscopy deployment readout

| Theme | Finding | `paper_id` |
| --- | --- | --- |
| Strongest polymer ID | μFTIR RDF and Raman RF/DL on environmental or multi-type particles | W4200249418, W4382931577, W4366815281 |
| In-situ gap | Prototypes and reviews forward; few deployed aquatic Raman+ML systems in obtained FT | W3172017684, W4404688861, W4396828529 (review, forward) |
| LATAM spectroscopy | Brazil beach RS+ML and Mexico/Brazil FTIR/Raman abstracts; Colombia Raman affiliation only | W4408220111, W4408550134, W3196128465, W4392657594 |
| Capex barrier | µ-Raman, μFTIR, HSI, O-PTIR exceed typical municipal LATAM budgets | W4366815281, W4200249418, W4213300830, W4308496878 |

Spectroscopy supplies **reference-grade identification** for monitoring programmes that can ship samples or fund shared laboratory hubs (W4200249418, W4366815281, W4382931577); it does not yet deliver the low-latency, in-field MP CV loop implied by many policy briefs (W4396828529).

### 3.3 Wastewater and treatment plants

Utility-scale **wastewater and sewage-sludge** pathways appear in this corpus mainly through **vibrational spectroscopy + ML** (W4200249418, W4366815281) and **reviews** of removal efficiency (W4296114416)—not through online computer-vision monitoring at treatment works. No obtained primary reports an ML/CV detector operating continuously on clarifier effluent or influent in a Global South plant (cf. W4210266455 review-level India context only).

#### 3.3.1 Primary evidence (treatment-adjacent matrices)

| `paper_id` | Year | Matrix | Method | Reported metrics | Plant / field context |
| --- | ---: | --- | --- | --- | --- |
| W4366815281 | 2023 | Environmental samples (catchment / **wastewater**-linked) | µ-Raman + deep learning | Precision ≥97.1%; recall ≥99.4% (PE, PP, PS, PVC, PET) | 47 German catchment samples; polymers in 10.7% of samples |
| W4200249418 | 2021 | Water, sediment, soil, compost, **sewage sludge**, air, sea salt | μFTIR imaging + RDF | Accuracy 0.9766; κ 0.9690 | Lab FPA-μFTIR; eight matrices incl. sludge |

**Synthesis.** Both primaries are **ex-situ laboratory identification** on collected fractions, not in-pipe vision or spectroscopy. W4366815281 demonstrates that environmental (including wastewater-catchment) samples can feed high-accuracy **five-polymer** Raman models, but instrument cost and class coverage limit utility adoption. W4200249418 explicitly includes **sewage sludge** among validated μFTIR+RDF matrices, making it the strongest sludge-relevant ML evidence in the map—still dependent on expert dual-control (W4200249418).

#### 3.3.2 Reviews: removal, textiles, and aged particles

| `paper_id` | Type | Wastewater-relevant takeaway | Role in map |
| --- | --- | --- | --- |
| W4296114416 | Systematic review (Raman in water bodies) | Cites **WWTP MP removal 1.8–54.5%** by treatment level; advanced treatment ~88.6%; pretreatment ~78±8% | Gap context only (low confidence aggregates) |
| W4382940669 | Narrative review (textile washing) | **Upstream** microfiber release; machine vision **prospective**, not aquatic monitoring | Prevention vs detection boundary |
| W4210266455 | Comprehensive MNP review | Water/food/**wastewater pathways**; India (Hyderabad) authorship; no unified aquatic ML benchmark | GS affiliation; landscape only |
| W4393943493 | Aged-MP review | Cites wastewater/sludge matrices in literature; weathering breaks library-match assumptions | Informs sludge/spectroscopy training gaps |

**Synthesis.** Review evidence (W4296114416) highlights **enormous heterogeneity** in how much MP is removed by conventional versus advanced treatment, so policy cannot rely on a single removal percentage. Textile-washing reviews (W4382940669) address **source reduction** before MPs reach plants, complementing but not replacing detection-focused RQ1–RQ4.

#### 3.3.3 Hyperspectral shape taxonomy (forward, wastewater matrices)

Forward papers on HSI **shape harmonisation** include **wastewater influent, effluent, and sludge** among seven environmental compartments (W4385737119, abstract; 11,042 particles). Deep-learning shape classifiers (W4414305742, abstract) target the same imaging stack. Neither provides obtained-FT **treatment-plant deployment** metrics in this corpus—they signal research interest in **sludge and effluent imaging**, not operational WWTP CV.

#### 3.3.4 Gap summary for utilities (LATAM-relevant)

| Gap | Evidence |
| --- | --- |
| No online ML/CV at WWTP in obtained FT | Absence among 31 included papers; in-situ review forward only (W4396828529) |
| Spectroscopy hub model vs pipe sensor | Primaries W4366815281, W4200249418 require lab µ-Raman / μFTIR |
| Removal ≠ detection | W4296114416 removal stats do not substitute for discharge monitoring ML |
| Global South plant studies | W4210266455 (India) review-level; no Colombia/Brazil/Mexico **plant** primary |

Monitoring programmes in resource-limited settings may need **tiered design**: textile/source controls (W4382940669), periodic **sludge/effluent laboratory spectroscopy** (W4200249418, W4366815281), and separate **receiving-water** CV where macro litter proxies are acceptable (Section 3.1)—not a single end-to-end WWTP vision model supported by current evidence.

### 3.4 Field versus laboratory performance gap

Across modalities, **laboratory and ex-situ workflows** report the highest classification metrics, while **field aquatic** evidence is thinner, often **macro-litter proxy** or **small-n** validation. This pattern drives transfer risk for operational monitoring (GAP-FIELD-01, GAP-FIELD-02 in [gap-list.md](../knowledge/gap-list.md)).

#### 3.4.1 Laboratory and ex-situ primaries (high reported metrics)

| `paper_id` | Modality | Matrix | Metric highlight | Aquatic in-situ? |
| --- | --- | --- | --- | --- |
| W3003736709 | Microscopy | Beach sediment lab | Count error 1.4% | No |
| W4282979647 | Fluorescence microscopy | Clam digestate | mF1 0.736 | No (bivalve lab) |
| W4200249418 | μFTIR imaging | Multi-matrix incl. water, sludge | Accuracy 0.9766 | Lab only |
| W4366815281 | µ-Raman | Catchment / wastewater samples | Precision ≥97.1% | Ex-situ environmental |
| W4382931577 | Raman | Spiked tap water / lab | Accuracy 98.8% | Spiked/lab |
| W4362015000 | Raman | Weathered SloPP-E | Accuracy 93.81% | Weathered lab standards |
| W4391319604 | Polarization holographic | Flow microfluidic lab | Accuracy up to 96% | Lab flow cell |
| W4383534319 | Microfluidic optical | Seawater (lab + tiny field) | Accuracy &gt;93%; field **n=5** | Partial |

**Synthesis.** Spectroscopy and optical-classification primaries report roughly **93–99%** class-level accuracy or precision on **prepared samples** (W4200249418, W4382931577, W4366815281); the segmentation score for W4282979647 (mF1 0.736) is a different metric and is not comparable. Even “environmental” labels (W4366815281) denote **brought-to-lab** fractions, not continuous in-stream sensing.

#### 3.4.2 Field, coastal, or in-situ-oriented evidence

| `paper_id` | Setting | What was measured | Metric highlight | MP-specific? |
| --- | --- | --- | --- | --- |
| W4391755619 | Thailand coastal UV imagery | MP polymer boxes | Precision 85–88%; **mAP 34–36%** | Yes (field MP) |
| W4291123479 | India urban river | Macro floating waste | mAP 89% | No (solid waste) |
| W3091414454 | Cambodia drone | Macro litter orthomosaics | Accuracy up to 83% | No |
| W3204790372 | China underwater | Macro garbage | +9.6% mAP vs baseline (unverified) | No |
| W4321194910 | Underwater / AUV | Marine debris | ΔAP +1–3% | No |
| W4213300830 | Po River | Freshwater MPs (HSI) | 1.89–8.22 particles/m³ | Yes (river MPs) |
| W3172017684 | Monitoring prototype | Multi-analyte water | CNN ~97% | MP unverified |
| W4404688861 | Ocean in-situ (abstract) | Raman+ML prototype | unverified | Claimed in-situ |

**Synthesis.** The only obtained-FT **field MP vision** pipeline with polymer-level intent is W4391755619—and it exposes the **precision versus mAP split** (Section 2.4). River and drone CV papers achieve strong headline accuracy on **litter**, not MPs (W4291123479, W3091414454). Po River HSI (W4213300830) maps **concentration** in a European river but with heavy HSI capital cost and lab-style campaigns.

#### 3.4.3 In-situ and deployment gap (reviews + forwards)

| Source | Claim | `paper_id` |
| --- | --- | --- |
| In-situ aquatic MP systematic review (forward) | Lab methods lack spatiotemporal coverage; in-situ tech needed | W4396828529 |
| ML+MP detection review (forward) | ML+spectroscopy to address imaging bottlenecks | W4409887007 |
| Deep UV Raman prototype | Real-time monitoring aspiration; MP subset unverified | W3172017684 |
| Microfluidic field demo | Strong lab metric; **five** field particles | W4383534319 |
| AI camera flume (abstract) | Real-time MP motion; not open water | W4400418758 |

Reviews (W4396828529) and forwards describe an **in-situ turn** in the literature that **outpaces** verified open full-text deployments in this corpus—**31** included papers remain overwhelmingly **lab or ex-situ** (GAP-FIELD-01).

#### 3.4.4 Implications for performance comparison

1. **Do not rank** lab Raman/FTIR accuracy against river YOLO mAP—they measure different tasks ([metrics-legend.md](../knowledge/metrics-legend.md)).  
2. **Field transfer** of lab metrics requires explicit replication on turbid matrices, weathered particles (W4393943493), and operational particle size distributions.  
3. **Pilot programmes** should report **both** detection rate (recall/mAP) and positive predictive value (precision) when citing field CV (W4391755619).  
4. **Global South field vision** exists (India, Thailand, Cambodia, China) but rarely for **MP-specific** aquatic monitoring; spectroscopy LATAM evidence is mostly abstract or affiliation-level (Section 3.2, Phase 9).

### 3.5 Cross-study metrics comparison

Reported performance in this map spans **seven task families** (spectral polymer ID, segmentation, object detection, relative gains, scene classification, counting, and instrument agreement) that must not be merged into a single leaderboard ([metrics-legend.md](../knowledge/metrics-legend.md)). The tables below group **primary extractions with obtained full text** where possible; review rows and `unverified` values are excluded from ranking but noted where they shape literature narratives.

#### 3.5.1 Spectral and vibrational ML — polymer identification (Family A)

| `paper_id` | Matrix | Classes / scale | Metric(s) | n / split | Lab / field |
| --- | --- | --- | --- | --- | --- |
| W4200249418 | Water, sediment, sludge, WWTP (μFTIR FPA) | &gt;20 polymers | Accuracy 0.9766; κ 0.9690; per-polymer sensitivity 0.957–1.000 | Monte Carlo CV; expert dual-control | Lab |
| W4366815281 | Catchment / wastewater (µ-Raman) | 5 polymers (PE, PP, PS, PVC, PET) | Precision ≥97.1%; recall ≥99.4% | 47 samples; polymers in 10.7% of samples | Ex-situ env. |
| W4382931577 | Spiked tap water; nanoplastics | 24 types | Accuracy 98.8%; sensitivity 98.5%; specificity 100% | Lab spiked | Lab |
| W4362015000 | Weathered SloPP-E standards | 22 polymers | Accuracy 93.81% (subspace KNN ensemble) | 97 SloPP-E test particles | Lab |
| W4383534319 | Seawater microfluidic | Small MPs (~10–45 µm) | Accuracy &gt;93%; AUC 0.98±0.02 | Field demo **n=5** trapped | Lab + tiny field |
| W3172017684 | Multi-analyte water prototype | MP subset **unverified** | CNN ~97% | Prototype; metrics unverified in extraction | Prototype |

**Narrative.** Spectral primaries cluster at **93–99%** class-level accuracy or precision on **defined polymer sets** (W4200249418, W4382931577, W4366815281). High numbers do not imply high **prevalence** in environmental matrices—W4366815281 reports polymer hits in only **10.7%** of samples. W4200249418 retains **human expert dual-control** despite strong RDF metrics, so “accuracy” here means **ML-assisted lab ID**, not unattended field classification.

#### 3.5.2 Vision detection and segmentation (Families B–E)

| `paper_id` | Task | MP vs macro | Metric(s) | Setting |
| --- | --- | --- | --- | --- |
| W4282979647 | Fluorescence MP segmentation | MP (lab stained) | mF1 0.736; mIoU 0.617; mRecall 0.883 (best arch. differs) | Lab microscopy |
| W4291123479 | Riverine YOLO detection | **Macro** solid waste | mAP 89%; F1 0.8; recall 86% | India urban river |
| W4391755619 | UV Faster R-CNN MP boxes | **MP** polymers | Precision 85.5–87.8%; **mAP 33.9–35.7%** | Thailand coastal field |
| W4321194910 | AUV marine debris | Macro debris | ΔAP +1.2–2.6% vs baseline | Underwater |
| W3204790372 | Underwater garbage OD | Macro | +9.6% mAP; +5% seg (**unverified**) | China |
| W4380082849 | Sentinel-2 pixel/scene class | Macro floating debris | **98%** (scenario 1); **83%** (scenario 2 plastic pixels); **75%** (ensemble quant.) | Satellite |
| W4205835860 | Sentinel-2 debris segmentation | Macro | Mean IoU **0.57**; per-class IoU **0.02–1.0** | Satellite |
| W3155690422 | PRISMA+S2 pansharpened | Down to ~8% pixel coverage | ~86% accuracy (**unverified** in extraction) | Satellite |
| W3091414454 | Drone litter CNN | Macro litter | Accuracy up to **83%**; PLQ 60–71% | Cambodia |

**Narrative.** The **largest headline mAP** in obtained FT (89%, W4291123479) applies to **macro riverine waste**, not MPs—a central incomparability tension (contradictions §2). The only field **MP object-detection** primary (W4391755619) shows why **precision alone is misleading**: high precision on proposed boxes coexists with **low mAP** because many true small objects are missed. Satellite papers (W4380082849, W4205835860) report strong pixel- or mask-level scores at **10 m** resolution—useful for **litter management**, not sub-millimetre MP quantification.
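
The point that a mean IoU can mask a failing class is illustrated below. The per-class values are invented so that the mean lands near 0.57 with a 0.02 to 1.0 spread; they are not the per-class results of W4205835860 or any other cited paper, and the `iou_summary` helper is hypothetical.

```python
from statistics import mean

def iou_summary(per_class_iou):
    """Report the mean together with the worst and best class."""
    worst = min(per_class_iou, key=per_class_iou.get)
    best = max(per_class_iou, key=per_class_iou.get)
    return {
        "mean_iou": mean(per_class_iou.values()),
        "worst": (worst, per_class_iou[worst]),
        "best": (best, per_class_iou[best]),
    }

# Invented values for illustration only.
per_class = {"class_a": 1.0, "class_b": 0.8, "class_c": 0.46, "class_d": 0.02}
summary = iou_summary(per_class)
print(summary)
```

A reader given only the mean would not learn that one class is essentially undetected, which is why the per-class spread belongs next to the average.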

#### 3.5.3 Counting, throughput, and concentration (Family F)

| `paper_id` | Unit | Reference / method | Metric(s) | Notes |
| --- | --- | --- | --- | --- |
| W3003736709 | Particle counts | Manual vs SMACC | Count error **1.4%**; class error &lt;4% (except line class) | Lab beach sediment; 2507 particles |
| W4391319604 | Flow-through counts | Bland-Altman vs reference | Accuracy up to **96%**; bias SD 0.05935 | Polarization holographic; edge/low-cost flagged |
| W4213300830 | Concentration | HSI + HI-PLS-DA | **1.89–8.22** particles/m³ (Po River) | Per-level metrics **unverified** in extraction |
| W4383534319 | Trapped particles | CNN vs SVM | See §3.5.1; field **n=5** | Throughput claim limited by field n |

**Narrative.** Counting metrics (W3003736709, W4391319604) answer **how many** or **how biased** relative to a reference—they do not substitute for polymer ID. W4213300830 reports **environmental concentration** with expensive HSI infrastructure, bridging quantification and capital-cost barriers (Section 3.2).
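
For readers unfamiliar with agreement statistics, a Bland-Altman summary for paired counts can be sketched as below. The paired counts are hypothetical, the `bland_altman` helper is an assumption, and the snippet does not reproduce the bias SD reported for W4391319604, whose units and pairing are not stated in the extraction.

```python
from statistics import mean, stdev

def bland_altman(method_counts, reference_counts):
    """Return bias and 95% limits of agreement for paired measurements."""
    if len(method_counts) != len(reference_counts):
        raise ValueError("paired measurements must have equal length")
    diffs = [m - r for m, r in zip(method_counts, reference_counts)]
    bias = mean(diffs)             # average signed difference
    sd = stdev(diffs)              # sample standard deviation of differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical paired particle counts (automated vs manual reference).
automated = [102, 98, 110, 95, 105]
manual = [100, 100, 108, 97, 104]
bias, (lower, upper) = bland_altman(automated, manual)
print(f"bias={bias:.2f} LoA=({lower:.2f}, {upper:.2f})")
```

The bias answers "how far off on average", while the limits of agreement answer "how far off a single sample might plausibly be"; neither says anything about polymer identity.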

#### 3.5.4 Cross-family tensions (what not to compare)

| Tension | Example `paper_id` | Correct reporting |
| --- | --- | --- |
| Precision vs mAP on same detector | W4391755619 | Report **both**; precision is conditional on detections |
| Scenario-dependent accuracy | W4380082849 | Label scenario (debris vs plastic pixels vs ensemble) |
| Architecture optimises different segmentation goals | W4282979647 | mRecall-best ≠ mF1-best architecture |
| Mean IoU hides class failure | W4205835860 | Report per-class spread (0.02–1.0) |
| Spectral accuracy vs YOLO mAP | W4366815281 vs W4291123479 | **Do not rank**—different families |
| Review aggregate vs primary table | W4393943493, W4318615471 | Secondary; trace to primary if cited |
| `unverified` extraction | W3204790372, W3155690422 | Do not use in performance league tables |

These tensions are documented in [contradictions.md](../knowledge/contradictions.md) §1, §7 and informed Phase 7 claims on metric hygiene.
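
A lightweight guard can keep incompatible metrics out of the same ranking. The sketch below is illustrative: the `TASK_PROFILE` labels are hypothetical shorthand for how this report groups the four papers, and sharing a profile is treated as a necessary condition for sharing a table, not a sufficient one.

```python
# Hypothetical labels derived from how this report groups each paper.
TASK_PROFILE = {
    "W4366815281": ("spectral_polymer_id", "microplastic"),
    "W4200249418": ("spectral_polymer_id", "microplastic"),
    "W4291123479": ("object_detection", "macro_litter"),
    "W4391755619": ("object_detection", "microplastic"),
}

def may_share_table(paper_a, paper_b, profiles=TASK_PROFILE):
    """True only when both papers share a task family and a target class."""
    return profiles[paper_a] == profiles[paper_b]

def require_comparable(paper_a, paper_b, profiles=TASK_PROFILE):
    """Raise instead of silently ranking metrics from different tasks."""
    if not may_share_table(paper_a, paper_b, profiles):
        raise ValueError(
            f"{paper_a} {profiles[paper_a]} and {paper_b} {profiles[paper_b]} "
            "measure different tasks; do not rank them together"
        )

print(may_share_table("W4366815281", "W4200249418"))
print(may_share_table("W4366815281", "W4291123479"))
```

Even the two object-detection papers fail the check because one targets macro litter and the other microplastics, matching the incomparability noted in the table.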

#### 3.5.5 Implications for synthesis and monitoring design

1. **Publish three metric tables** in any comparative SLR—spectral ID, vision detection/segmentation, counting—never a single “accuracy” column (W4200249418 vs W4391755619 vs W3003736709).  
2. **Require scenario labels** when one paper reports multiple accuracies (W4380082849).  
3. **For field MP CV**, treat **mAP/recall** as operational detection rate and **precision** as positive-predictive value among alerts (W4391755619).  
4. **For spectroscopy pilots**, pair reported **classifier metrics** with **sample hit rate** and instrument capex (W4366815281, W4200249418).  
5. **Abstract-only and review-cited percentages** remain hypotheses until full-text verification—**8** of **37** extraction rows are abstract-only in this corpus (e.g. W4409887007, W4400418758 forwards).

Results §3.1–3.5 collectively show that **high metrics are modality-specific and matrix-specific** (W4366815281 lab Raman vs W4391755619 field UV vs W4291123479 macro YOLO); transfer to resource-limited aquatic monitoring requires explicit task definition before any number is quoted in policy or procurement documents.

## 4. Discussion

Results §3.1–3.5 show strong **laboratory** performance for spectroscopy and microscopy ML (W4200249418, W4282979647), thinner **field aquatic** evidence (W4391755619, W4383534319), and frequent **task mismatch** between macro-litter computer vision (W4291123479, W4380082849) and microplastic-specific monitoring. This section interprets **deployment barriers** relevant to resource-limited agencies, municipal utilities, and Latin American contexts—without treating literature metrics as procurement guarantees.

### 4.1 Deployment barriers

#### 4.1.1 Capital expenditure and institutional capacity

Validated microplastic identification in this corpus overwhelmingly assumes **institutional spectroscopy or imaging platforms**: μFTIR random-decision forests on FPA imaging (W4200249418), µ-Raman deep learning on catchment and wastewater fractions (W4366815281), bench Raman for nanoplastics (W4382931577), and hyperspectral river campaigns (W4213300830). Reviews of water-body remote sensing and spectroscopy stress **equipment cost, operator training, and maintenance** as adoption limits (W4296114416, W4318615471). O-PTIR and related high-resolution vibrational workflows (W4308496878) further raise the capex floor relative to municipal water-laboratory budgets.

**Barrier:** procurement based on “near-100% accuracy” abstracts (W4393943493 review aggregates) without line-item **instrument, consumable, and expert-labour** costs is misaligned with LATAM laboratory reality (GAP-EDGE-02). Phone-microscope YOLO studies (W4409162823, forward) target **consumer-product** matrices, not continuous river sensing.

#### 4.1.2 In-situ sensing versus ex-situ laboratory hubs

Systematic reviews argue that conventional extraction-based methods lack **spatiotemporal coverage** and that **in-situ** aquatic technologies are needed (W4396828529). The map’s **31** obtained full texts remain dominated by **brought-to-lab** workflows—even “environmental” µ-Raman (W4366815281) and multi-matrix μFTIR (W4200249418). Prototypes point toward in-stream monitoring (deep UV Raman CNN, W3172017684; ocean Raman+ML, W4404688861) but **MP-specific performance is unverified** or paywalled in extraction.

**Barrier:** agencies cannot deploy a single “sensor in the pipe” product supported by open, MP-validated full-text evidence in this corpus (GAP-FIELD-01; W4396828529, W3172017684, W4404688861). Operational models likely require **periodic laboratory hubs** (W4200249418, W4366815281) plus occasional field campaigns—not continuous autonomous MP ID.

#### 4.1.3 Field replication, edge hardware, and “low-cost” framing

Promising lab pipelines lack proportional **field evidence**. Microfluidic seawater CNNs exceed 93% accuracy in the laboratory but report only **five** field-trapped particles (W4383534319). Polarization holographic flow-through imaging (W4391319604), the only **edge_low_cost** primary with obtained full text, still depends on **specialised optics** and microfluidic fabrication; the authors’ “portable/cost-effective” framing does not equal municipal affordability without supply-chain analysis.

Among **four** extraction rows flagged edge/low-cost, **three** are abstract-only, macro-litter, or only partially verified (W4400418758 AI camera flume; W3204790372 underwater macro litter; partial verification on W4391755619 field UV stack) (GAP-EDGE-01). Real-time MP camera abstracts (W4400418758) cannot inform procurement until metrics are verified.

**Barrier:** “Edge AI” marketing (W4400418758, W4391755619) outruns **open, replicated field trials** with reported n, matrices, and maintenance plans (W4383534319 field **n=5**; W4391319604 only verified edge primary).

#### 4.1.4 Task mismatch: macro litter, satellite debris, and MP policy goals

Riverine YOLO at **89% mAP** (W4291123479), drone litter CNNs up to **83%** accuracy (W3091414454), and Sentinel-2 debris classifiers at **98%** in scenario 1 (W4380082849) address **visible litter or debris**, not polymer-level microplastics in water (GAP-SCALE-01). Transferring these metrics to MP regulation risks **false confidence** in compliance monitoring.

The sole obtained-FT **field MP vision** primary (W4391755619) demonstrates a second barrier, **high precision with low mAP**: the detector is suitable for targeted audits if operators understand alert bias, but unsuitable as the sole compliance sensor without recall reporting (Section 3.5).

**Barrier:** monitoring programmes must **separate litter-management CV** (W4291123479, W3091414454, W4380082849) from **MP quantification/ID** budgets and KPIs (W4391755619, W4366815281).

#### 4.1.5 Sample preparation, standardisation, and human-in-the-loop

High μFTIR accuracy (0.9766, W4200249418) coexists with **expert dual-control** and commercial spectral software: ML assists analysts rather than replacing them. SMACC microscopy counting (W3003736709) requires **cleaned lab sediment**; weathered-particle Raman ML (W4362015000) uses **SloPP-E standards** that may not represent aged environmental particles (W4393943493). Bibliometric reviews highlight **manual annotation bottlenecks** (W4313826580) while extraction-heavy matrices persist in spectroscopy primaries.

**Barrier:** hidden **digestion, filtration, staining, and substrate** steps dominate wall-clock time and skill requirements—automation claims in reviews (W4404459247 forward; W4313826580) are **matrix-dependent**, not universal for turbid tropical rivers or complex sludge (W4200249418, W4366815281).

#### 4.1.6 Geographic evidence and LATAM transfer risk

Global South **field** vision studies appear in India (W4291123479), Thailand (W4391755619), Cambodia (W3091414454), and China (W3204790372), but **no obtained primary** documents a **Colombia in-country field programme** for aquatic ML/CV microplastics (W4392657594 affiliation only; GAP-GEO-01). Brazil urban-beach remote sensing + ML (W4408220111) remains **abstract-only** in this corpus (GAP-GEO-03).

**Barrier:** Latin American agencies lack **locally validated** error rates for turbidity, biofouling, and polymer mixtures (W4392657594 affiliation only; W4408220111 abstract-only Brazil beach); extrapolation from South/Southeast Asian rivers (W4291123479, W4391755619) requires explicit **environmental similarity** arguments, not citation of headline accuracy alone.

#### 4.1.7 Evidence access and benchmark gaps for procurement

**85** forwarded papers lacked retrievable open full text under project rules (manifest; GAP-CORPUS-01), including recent ML+MP detection reviews (W4409887007). Unlike marine debris (**MARIDA**, W4205835860), there is **no shared aquatic MP CV benchmark** spanning labs and Global South sites (GAP-METHOD-01). **Eight** extraction rows remain abstract-only with `unverified` metrics (GAP-METHOD-03).

**Barrier:** tenders that reference “state of the art” without **open datasets** (no aquatic MP CV benchmark exists, unlike MARIDA for marine debris, W4205835860), **verified metrics**, and **maintenance contracts** cannot be defended from this evidence base alone (W4409887007 forward).

#### 4.1.8 Toward tiered monitoring (synthesis, not a single product)

Evidence supports **layered** designs rather than one ML box (examples in table—W4382940669 through W3172017684):

| Tier | Role | Corpus examples | Deployment note |
| --- | --- | --- | --- |
| **Source / prevention** | Reduce release before waterbody | Textile washing controls (W4382940669 review) | Policy lever; vision “prospective” only |
| **Surveillance (litter proxy)** | Macro floating waste, drones, satellite | W4291123479, W3091414454, W4380082849 | Useful for **litter** management; weak MP specificity |
| **Field MP screening** | UV or RGB alerts with explicit recall | W4391755619 | Report precision **and** mAP; small-object limits |
| **Laboratory confirmation** | Polymer ID on subsets | W4200249418, W4366815281, W4382931577 | Capex + skilled staff; WWTP sludge/effluent (Section 3.3) |
| **Research / pilot** | Microfluidics, holographic flow, in-situ Raman | W4383534319, W4391319604, W3172017684 | Requires larger field **n** before scale-up |

Wastewater evidence (Section 3.3) reinforces that **removal statistics** (W4296114416) do not replace **discharge monitoring**—utilities still need matrix-appropriate ID tiers.

Section 6 develops **Colombia-, Magdalena-, and Caribbean-relevant** scope; the cross-cutting barrier here is that **literature performance ≠ deployable system** without capital, field replication, task clarity, and open verification.

### 4.2 Hype versus validated performance

“Hype” here means **claims that outrun what open, obtained full-text primaries demonstrate** for **aquatic microplastic monitoring** (e.g. W4291123479 cited as “MP” detection)—not that ML is useless. This map distinguishes **validated** evidence (obtained full text, metrics traceable to stated task/matrix, extraction audited or marked `unverified`) from **narrative amplification** in abstracts, reviews, and vendor-style framing.

#### 4.2.1 What counts as validated in this corpus

| Criterion | Validated example | Not validated (in corpus) |
| --- | --- | --- |
| Obtained OA full text + primary extraction | W4200249418 μFTIR RDF; W4391755619 field UV detector | W4409887007 ML+MP review (forward, no FT) |
| Metric tied to defined task/matrix | W4380082849 scenarios 1–3 labelled | “~97% CNN” on multi-analyte prototype without MP subset proof (W3172017684) |
| Field n and setting reported | W4391755619 Thailand coastal UV; **n** explicit in source | W4383534319 field **n=5**: validated, but too small to support a **scalability** claim |
| MP-specific aquatic task | W4366815281 five-polymer µ-Raman on catchment samples | W4291123479 **89% mAP** on macro solid waste |
| Both salient detector metrics when applicable | Precision **and** mAP for W4391755619 | Precision **alone** as headline |

**31** included full texts and **28** primary extraction rows anchor “validated” discussion; **9** review extractions inform gaps only. **85** forwards without retrievable FT cannot support performance ranking (GAP-CORPUS-01).

#### 4.2.2 Misleading headline patterns (literature, not vendor names)

The following patterns appear in harvested papers and reviews; each has a **validated reframing** grounded in primaries ([metrics-legend.md](../knowledge/metrics-legend.md) §7):

| Hype pattern | Why it misleads | Validated reading | `paper_id` |
| --- | --- | --- | --- |
| “Detection accuracy up to **98%**” (satellite) | Pixel/scene debris ≠ MP in water | Scenario 1 floating debris; **83%** plastic pixels; **75%** ensemble quantification | W4380082849 |
| “**87%** precision” MP detection | Ignores missed small objects | **85.5–87.8%** precision with **mAP 33.9–35.7%** on same detector | W4391755619 |
| “State-of-the-art YOLO for microplastics” | Task is macro riverine waste | **89% mAP**, **86%** recall on **solid waste**, India urban river | W4291123479 |
| “Raman ML **near 100%** in the environment” | Classifier on **five polymers**; polymers in **10.7%** of samples | Precision ≥97.1%, recall ≥99.4% on hits; low prevalence | W4366815281 |
| “Real-time / in-situ monitoring solved” | Prototype or review aspiration | In-situ review need (W4396828529); ocean prototype **unverified** (W4404688861) | W4396828529, W4404688861 |
| “Low-cost edge AI for MPs” | **3/4** edge-flagged rows lack obtained FT | Only W4391319604 verified; specialized holographic hardware | GAP-EDGE-01 |
| “Fully automated pipeline” | Expert dual-control remains | μFTIR RDF accuracy 0.9766 **with** expert QA | W4200249418 |
| “MARIDA-level benchmark for MPs” | Marine **debris** segmentation; mean IoU **0.57**, class IoU **0.02–1.0** | Macro litter transfer only; not MP benchmark | W4205835860 |

Policy briefs and grant proposals should **copy the validated column**, not the left-hand hype pattern.

#### 4.2.3 Automation and “AI gap” narratives

Bibliometric and systematic reviews describe an **automation gap**: manual annotation, library matching, and standardisation burdens (W4313826580, W3134265767, W4396828529). That narrative is **directionally supported**—yet the strongest numeric claims in primaries still sit on **ML-assisted laboratory spectroscopy** with substantial human and prep steps (W4200249418, W4304690559).

Forward primers promise reduced extraction labor (W4404459247) while μFTIR and µ-Raman primaries document **digestion, filtration, substrates, and expert review** (W4200249418, W4366815281). **Hype** equates “deep learning inside the instrument pipeline” with “unattended field robot”; **validated** evidence supports **throughput gains on prepared fractions**, not elimination of environmental sampling.

Hyperspectral shape-taxonomy papers (W4385737119, W4414305742, forwards) claim **replacement of expert labeling** at 11,042 particles—valuable for **comparability**, but not yet linked to open **aquatic field** benchmarks in this corpus (GAP-SCALE-03).

#### 4.2.4 Review aggregates versus primary tables

Aged-microplastic reviews cite literature **μFTIR ML >97%** aggregates (W4393943493). Plastic-waste HSI reviews cite **~88.6%** ResNet sorting accuracy but also **<80%** precision for sub-millimetre particles in cited work (W4318615471). Raman water-body reviews aggregate **WWTP removal 1.8–54.5%** depending on treatment (W4296114416); such heterogeneous numbers are unsuitable as a single “technology works” headline.

**Rule for this map:** reviews **motivate** gaps and cite directions; **primaries with obtained FT** bound what we treat as **verified performance** for aquatic ML/CV MPs. Citing W4318615471’s 88.6% without the sub-mm caveat repeats hype; pairing both sentences is validated synthesis.

#### 4.2.5 Relative gains and incremental CV papers

Papers reporting **+9.6% mAP** (W3204790372, unverified) or **+1.2–2.6% ΔAP** (W4321194910) demonstrate **architecture tweaks** on marine-debris tasks without absolute operational mAP in extraction. **Hype:** “significant improvement” in abstracts. **Validated:** incremental gain **on a stated baseline** for **macro debris**—useful for methods papers, weak for procurement unless baseline absolute performance on the target matrix is published and verified.
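
A short arithmetic sketch shows why a gain without an absolute baseline is hard to use. The baselines below are hypothetical (the extraction records no absolute mAP for these papers), and the function name is illustrative. A reported "+9.6%" is read two ways: as percentage points added to the baseline, or as a relative change.

```python
def after_gain(baseline_map, gain, relative=False):
    """mAP after a reported gain, read as added points or as a relative change."""
    return baseline_map * (1 + gain) if relative else baseline_map + gain


REPORTED_GAIN = 0.096  # "+9.6% mAP" as reported; interpretation is not stated here

for baseline in (0.30, 0.80):  # hypothetical weak and strong baselines
    as_points = after_gain(baseline, REPORTED_GAIN)
    as_relative = after_gain(baseline, REPORTED_GAIN, relative=True)
    print(f"baseline={baseline:.2f} points={as_points:.3f} relative={as_relative:.3f}")
```

Under these assumptions the same headline gain yields a final mAP anywhere from about 0.33 to about 0.90, so the gain alone cannot rank systems for procurement.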

#### 4.2.6 What validated performance actually supports (2019–2025 map)

Synthesising §3–4.1, **defensible** statements for resource-limited aquatic monitoring (anchored to W4200249418, W4391755619, W4291123479, W4396828529) include:

1. **Lab spectroscopy + ML** delivers high polymer-class accuracy on **defined** environmental fractions when samples reach a hub (W4200249418, W4366815281, W4382931577)—with capex and prep costs explicit.  
2. **Microscopy / flow imaging** can automate counting or classification on **controlled** matrices (W3003736709, W4391319604, W4282979647) with matrix limits stated.  
3. **Field MP vision** is **emerging** with one obtained-FT polymer-box detector showing **usable precision but poor holistic detection** (W4391755619)—report both metrics.  
4. **Macro litter CV** in Global South rivers and coasts is **real but non-substitutable** for MP policy (W4291123479, W3091414454, W3204790372).  
5. **Satellite and AUV debris** models are **validated for litter/debris surveillance**, not sub-mm MPs (W4380082849, W4205835860, W4321194910).  
6. **In-situ aquatic MP sensing** remains **review-forward and prototype-heavy** relative to lab spectroscopy (W4396828529, W3172017684, W4404688861).

Statements beyond this list—e.g. “deployable nationwide AI microplastic network,” “$10 phone microscope for river compliance,” “real-time plant-edge MP sensor”—require **new field trials** not evidenced in open full texts here (W4409162823, W4400418758 forwards).

#### 4.2.7 Checklist for readers and funders

Before treating a number as **validated** for tenders or regulation:

- [ ] Obtained full text reviewed (not abstract-only)?  
- [ ] Task family identified (spectral ID vs detection vs counting)?  
- [ ] MP vs macro litter explicit?  
- [ ] Lab vs field matrix explicit?  
- [ ] All salient metrics reported (precision **and** mAP when both exist)?  
- [ ] Sample **n** and class prevalence stated (W4366815281)?  
- [ ] Scenario label if multiple accuracies (W4380082849)?  
- [ ] Human-in-the-loop and instrument cost acknowledged?  
- [ ] Local geographic validation (LATAM field site) or extrapolation justified?

Until these checks pass, treat performance claims as **hypotheses for pilot funding**, not as **proven monitoring performance**.
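
The checklist can be applied mechanically when triaging claims. The sketch below is one possible encoding; the class, field, and function names are hypothetical and do not correspond to columns in `extraction.csv`.

```python
from dataclasses import dataclass, fields


@dataclass
class MetricClaim:
    """One performance claim, with one boolean per checklist item."""
    full_text_reviewed: bool
    task_family_identified: bool
    mp_vs_macro_explicit: bool
    lab_vs_field_explicit: bool
    all_salient_metrics_reported: bool
    n_and_prevalence_stated: bool
    scenario_labelled: bool
    human_loop_and_cost_acknowledged: bool
    local_validation_or_justified: bool


def failed_checks(claim):
    """Names of checklist items the claim does not satisfy."""
    return [f.name for f in fields(claim) if not getattr(claim, f.name)]


def status(claim):
    """'validated' only when every check passes; otherwise a pilot hypothesis."""
    return "validated" if not failed_checks(claim) else "hypothesis for pilot funding"


# Hypothetical claim: precision reported alone, no local validation argument.
example = MetricClaim(True, True, True, True, False, True, True, True, False)
print(status(example), failed_checks(example))
```

A single failed item is enough to keep a number out of a tender, which matches the all-or-nothing wording of the checklist.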

## 5. Limitations

This work is a **systematic map** (evidence charting and gap analysis), not a quantitative meta-analysis. Performance figures are **not pooled** across studies because metrics and tasks are incomparable (Section 3.5). Limitations follow the methods in Section 2 and the corpus state at search lock **2026-05-18** (`manifest.json`).

### 5.1 Search and identification

- **Single bibliographic source:** Records were harvested from **OpenAlex** only (five English keyword queries; `protocol.md`). Grey literature, regional databases (SciELO, Redalyc), patent corpora, and non-indexed municipal reports were **not** searched.  
- **Query language:** Strings are English-centric; relevant Spanish- or Portuguese-language aquatic MP ML studies may be under-represented despite Global South field sites in some primaries (W4291123479, W3091414454).  
- **Date window:** Publications from **2019-01-01** through lock; pre-2019 foundational HSI or Raman ML papers enter only via **review citations**, not systematic re-search.  
- **OpenAlex metadata quality:** Abstracts and type labels (`article` vs `review`) depend on source metadata; misclassified or thin abstracts can affect title–abstract screening consistency.
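
For readers who want to re-run or extend the harvest, a query against the public OpenAlex `works` endpoint can be sketched as follows. The search string is hypothetical (the five actual queries are defined in `protocol.md`), and the `search`, `filter`, and `per-page` parameters are used as understood from the OpenAlex API documentation. The sketch only builds the URL; it makes no network request.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def build_query_url(search_terms, date_from, date_to, per_page=200):
    """Build a works query limited to a publication-date window."""
    params = {
        "search": search_terms,
        "filter": f"from_publication_date:{date_from},to_publication_date:{date_to}",
        "per-page": per_page,
    }
    # Keep ':' and ',' readable inside the filter expression.
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=':,')}"


# Hypothetical query string; the date window matches the protocol and search lock.
url = build_query_url("microplastics machine learning water", "2019-01-01", "2026-05-18")
print(url)
```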

### 5.2 Screening

- **Single-reviewer operational model:** Title–abstract decisions were logged in `screening-log.csv` with protocol-aligned exclusion reasons, but this factory run did **not** implement independent dual screening or κ statistics.  
- **Borderline cases:** Papers on **marine litter**, **plastic waste sorting**, or **general water-quality ML** required judgement to separate aquatic MP detection from adjacent topics (screening rationale in `outputs/knowledge/screening-rationale.md`).  
- **Forward threshold:** **116** papers forwarded to full text (target ≥35 met); forward set is large relative to obtained FT, increasing **synthesis asymmetry** toward abstract-level knowledge for non-included papers.
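
Had independent dual screening been run, agreement could have been summarised with Cohen's κ. The sketch below shows the calculation on four hypothetical decisions; no such second-reviewer data exists in `screening-log.csv`, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(reviewer_a, reviewer_b):
    """Cohen's kappa for two reviewers' categorical screening decisions."""
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("need two non-empty decision lists of equal length")
    n = len(reviewer_a)
    # Observed agreement: share of records with identical decisions.
    observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n
    # Chance agreement from each reviewer's marginal label frequencies.
    counts_a, counts_b = Counter(reviewer_a), Counter(reviewer_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical title-abstract decisions for four records.
a = ["forward", "forward", "exclude", "exclude"]
b = ["forward", "exclude", "exclude", "exclude"]
print(round(cohens_kappa(a, b), 2))
```

Here observed agreement is 0.75 and chance agreement is 0.50, giving κ = 0.5.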

### 5.3 Full-text access and included evidence

| Stage | n | Limitation |
| --- | ---: | --- |
| Identified | 228 | — |
| Forwarded to full text | 116 | — |
| Full text obtained | 31 | **73%** of forwards not retrieved under project OA/rules |
| Not retrieved | 85 | Paywall, licensing, or unavailable PDF/HTML |
| Included in summaries | 31 | All obtained FT summarised; no selective drop within obtained set |
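
The counts in the table can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the dictionary keys and function name are illustrative.

```python
flow = {
    "identified": 228,
    "forwarded": 116,
    "obtained": 31,
    "not_retrieved": 85,
    "included": 31,
}


def share_not_retrieved(counts):
    """Check that the stages add up and return the unretrieved share of forwards."""
    if counts["obtained"] + counts["not_retrieved"] != counts["forwarded"]:
        raise ValueError("obtained + not retrieved must equal forwarded")
    if counts["included"] > counts["obtained"]:
        raise ValueError("cannot include more full texts than were obtained")
    return counts["not_retrieved"] / counts["forwarded"]


print(f"{share_not_retrieved(flow):.0%} of forwarded papers not retrieved")
```

The result, 85 / 116, rounds to the 73% quoted in the table.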

- **Access policy:** Only **open-access or rights-clear** PDFs/HTML were stored (`corpus/raw/`). **25** OA PDFs on disk; **5** additional full texts as HTML/normalized text. Paywalled primaries (e.g. fish HSI rapid workflow W2936115560; central 2025 ML+MP review W4409887007) could not be verified.  
- **Included-set bias:** Synthesis of numeric performance **over-represents OA-friendly venues** and under-represents recent paywalled reviews that may summarise the field (GAP-CORPUS-01, GAP-CORPUS-02).  
- **No contact with authors:** Missing metrics were marked `unverified` rather than requested by email.

### 5.4 Extraction and structured data

- **Partial extraction coverage:** **37** extraction rows cover modality-prioritised papers, **not** all **31** included full texts—**W4308496878** (O-PTIR+Raman) has summary and claims but **no** `extraction.csv` row (questions.md Q7).  
- **Abstract-only extractions:** **8** rows lack obtained FT; metrics remain `unverified` and must not be ranked (GAP-METHOD-03).  
- **Metric transcription:** Performance strings were copied from full text or abstracts without independent replication of experiments.  
- **Global South flag:** **12** rows `global_south=yes` vs **four** GS field-site vision primaries—affiliation tagging can overstate deployment evidence (GAP-GEO-02).  
- **No risk-of-bias tool:** Extraction captured author-stated limits, not formal RoB2/QUADAS-style scores.

### 5.5 Synthesis, claims, and automation

- **Claims layer:** **61** statements in `claims.jsonl` with confidence tags; they guide but do not replace traceability to `paper_id` in this report.  
- **AI-assisted factory:** Corpus harvest, screening batches, summarisation, and draft `slr.md` sections were produced in an **agent-assisted Ralph loop** with human-defined protocol and validation script (`validate-corpus.sh`). Residual factual errors are possible; citation audit (spine 8.18) is the corrective control.  
- **Reviews in discussion:** **9** review extraction rows inform gaps; they were **excluded** from primary performance tables in Results to avoid double-counting literature aggregates (W4393943493, W4318615471).

### 5.6 Scope and generalisability

- **Aquatic emphasis:** Terrestrial-only sorting or indoor-air HSI shape papers (W4385737119, W4414305742 forwards) appear only where matrices include wastewater or environmental water fractions.  
- **MP vs macro litter:** Many high-performing CV studies address **visible litter or debris** (W4291123479, W4380082849); generalisation to **regulatory MP thresholds** is not supported.  
- **LATAM/Colombia:** No obtained primary documents a **Colombia in-country field** aquatic ML/CV MP programme (W4392657594 affiliation only). Section 6 discusses Magdalena/Caribbean scope; this map does **not** validate local error rates.  
- **Temporal drift:** Search lock **2026-05-18**; papers published after lock are out of scope.

### 5.7 Implications for use of this map

Readers should use this document to **prioritise gaps and architectures** (Sections 4.1–4.2, 6), not as a pooled effect-size summary. Procurement, regulation, and national monitoring design require **local pilot data**, open benchmarks (GAP-METHOD-01), and full-text verification of any cited metric. Follow-up work flagged in [questions.md](../../questions.md) includes extraction of W4308496878 and retrieval of W4409887007.

## 6. Regional focus: Colombia, Magdalena River, and Caribbean coast

This section states **honest scope** for readers using this map for **Colombian** or **Caribbean** aquatic monitoring; it is not a claim that the literature already validates Magdalena-basin or Caribbean-coast ML deployments. Detail is expanded in [latam-gap-analysis.md](./latam-gap-analysis.md) (Phase 9.2).

### 6.1 What this map does and does not contain

| Question | Answer in this corpus (search lock 2026-05-18) |
| --- | --- |
| Any obtained FT with **Magdalena River** sampling + ML/CV MPs? | **No** — no `paper_id` names Magdalena or Cauca confluence campaigns |
| Any obtained FT with **Colombian Caribbean** coast/estuary + ML/CV MPs? | **No** |
| Any Colombia-linked aquatic MP ML paper at all? | **One forward** Raman classification study with **CO author affiliation** (W4392657594); metrics `unverified`; **no field site** in available text |
| Closest LATAM **field** MP+ML signal | Brazil urban **sandy beaches** (São Paulo), RS+µ-Raman+ML (W4408220111, forward FT not obtained) |

The OpenAlex harvest of **228** works and [`latam-papers.jsonl`](../corpus/structured/latam-papers.jsonl) (**25** LATAM-flagged rows) did not surface Spanish-primary regional grey literature; **SciELO/Redalyc were not searched** (Section 5.1). Absence here is **evidence of a gap in this map**, not proof that no Colombian research exists globally.

### 6.2 Magdalena River basin — relevance without validation

The **Magdalena–Cauca** system is a plausible policy anchor for **freshwater** MP monitoring in Colombia (high discharge, urban/agricultural pressure, downstream Caribbean connection), but **no included study** reports ML/CV detection on Magdalena water, sediment, or effluent matrices.

**Indirectly relevant** obtained evidence (transfer only):

| Evidence type | `paper_id` | Why cited for Magdalena discussion | Limit for Magdalena |
| --- | --- | --- | --- |
| Freshwater river MP **concentration** (HSI, Europe) | W4213300830 | Shows riverine MP mapping workflow (particles/m³) | Po River; HSI capital; not ML/CV alert system |
| Tropical **urban river** solid-waste YOLO | W4291123479 | High mAP on floating waste in India | **Macro waste**, not MPs; hydraulic/turbidity regime differs |
| Catchment **µ-Raman** lab ID | W4366815281 | Environmental water/sludge fractions | Germany catchment; ex-situ hub |
| Wastewater **μFTIR** multi-matrix | W4200249418 | Influent/effluent/sludge in protocol | European lab workflow; no Magdalena utility tie-in |

A Magdalena programme would need **new** campaigns: defined size classes, wet-season/dry-season replication, turbidity strata, and polymer confirmation on a subset (W4200249418, W4366815281)—not citation of W4291123479 mAP alone.

### 6.3 Caribbean coast — relevance without validation

**Caribbean-facing** Colombian waters (e.g. deltas, ports, coral-influenced coasts) share challenges with other **tropical coastal** studies in the map—UV fouling, macro algae/detritus confusion, and **satellite resolution limits** for sub-mm MPs—but **no paper** reports Caribbean Colombia field validation.

| Evidence type | `paper_id` | Caribbean transfer note | Limit |
| --- | --- | --- | --- |
| Field **UV MP** polymer boxes | W4391755619 | Thailand coastal Faster R-CNN; precision/mAP split | Different coast; not Caribbean |
| Drone **macro litter** | W3091414454 | Cambodia orthomosaic litter CNN | Litter management proxy only |
| Sentinel-2 **floating debris** | W4380082849 | Scenario accuracies up to 98% / 83% | **10 m** pixels; macro debris |
| Marine debris benchmark | W4205835860 | MARIDA segmentation IoU spread | Atlantic/Mediterranean training context; not MP |
| Mediterranean HSI marine MPs (abstract forward) | W4403492962 | Regional HSI classification mention | Forward; not Caribbean Colombia |

Caribbean **citizen science / water monitoring** literature (e.g. W3092266267) addresses participation benefits, not MP ML performance—out of scope for method transfer tables.

### 6.4 Colombia affiliation versus national monitoring (GAP-GEO-01)

**W4392657594** (forward) studies high-frequency noise in **Raman ML** for polymer classification with **Colombian author affiliation**; it does **not** establish that Colombia operates a validated aquatic MP vision or spectroscopy network. Policy documents should not cite this row as “Colombian field evidence.”

**W2959012558** (forward, multi-country marine debris observing review) lists **CO** among author countries but is not an MP ML detection primary.

### 6.5 Practical implications for Magdalena/Caribbean programmes

1. **Do not** procure ML/CV MP systems citing Colombia-validated performance from this map—there is none in obtained full text.  
2. **Do** treat tiered architectures (Section 4.1.8) as **hypotheses**: macro litter/drone/satellite for spatial surveillance (W3091414454, W4380082849), periodic spectroscopic ID on subsets (W4200249418, W4366815281), optional UV screening if replicated locally (methods from W4391755619).  
3. **Prioritise** open retrieval of W4408220111 (Brazil beach RS+ML) and local **Magdalena/Caribbean pilots** with dual metrics (detection rate + polymer confirmation).  
4. **Supplement** this map with Spanish/Portuguese regional search before national regulation (Section 5.1).

[questions.md](../../questions.md) **Q9** (Colombia architecture without local validation) remains **open**; this section narrows it: **no Magdalena- or Caribbean-specific ML/CV MP validation** exists in the current corpus.

## 7. Low-cost and edge deployment readiness

“Low-cost” and “edge” labels in the literature rarely mean **affordable, maintainable aquatic MP monitoring** in resource-limited or LATAM municipal settings. This section rates **readiness** from obtained open full text and structured `edge_low_cost` flags in `extraction.csv` (Phase 6.9)—see GAP-EDGE-01, GAP-EDGE-02 in [gap-list.md](../knowledge/gap-list.md).

### 7.1 Corpus inventory of edge / low-cost claims

| `paper_id` | `edge_low_cost` | FT obtained? | Matrix | MP-specific aquatic? | Readiness verdict |
| --- | --- | --- | --- | --- | --- |
| W4391319604 | yes | **Yes** | Flow microfluidic **lab** | Yes (flow MPs) | **Pilot lab only** — portable claim; specialized optics |
| W4391755619 | yes | **Yes** | Coastal field **UV** imagery | Yes (polymer boxes) | **Field demo** — low mAP; UV rig not budget edge |
| W3204790372 | yes | **Yes** | Underwater **macro garbage** | No | **Embeddable CV** for AUV; not MP ID |
| W4400418758 | yes | No | **Lab flume** AI camera | Claimed MP motion | **Unverified** — not open water |
| W4409162823 | yes | No | **Consumer product** phone YOLO | Not aquatic field | **Unverified** — $10 scope; wrong matrix |

**Scorecard:** **4** extraction rows flagged `edge_low_cost=yes`; **3** have obtained FT; **1** (W4391319604) combines obtained FT + MP focus + author cost framing; **0** demonstrate **maintained, open-water, polymer-validated** edge deployment in a resource-limited country.

### 7.2 What “ready” would require (evidence gap)

| Criterion | Best corpus anchor | Gap |
| --- | --- | --- |
| Open FT with metrics | W4391319604 (96% accuracy claim, lab flow) | No field aquatic matrix |
| Field aquatic CV | W4391755619 | Capex/UV protocol; mAP 34–36% |
| Verified bill of materials | — | Not reported in obtained FT |
| &gt;12 months ops / maintenance data | — | Absent |
| LATAM or GS municipal pilot | — | Absent (Section 6) |

Reviews note automation and cost barriers for imaging (W4313826580, W4296114416) while primaries assume **µ-Raman, μFTIR, HSI** (W4366815281, W4200249418)—the opposite of disposable edge hardware.

### 7.3 Tiered readiness model (for procurement)

| Tier | Description | Readiness in corpus | Example `paper_id` |
| --- | --- | --- | --- |
| **A — Lab screening** | Flow or microscope ML on prepared samples | **Moderate** (obtained FT) | W4391319604, W4383534319, W3003736709 |
| **B — Field alert (vision)** | RGB/UV detector without polymer confirm | **Low** (1 MP field FT) | W4391755619 |
| **C — Field litter proxy** | Macro waste/debris CV | **Low–moderate** (metrics strong, task wrong) | W4291123479, W3204790372 |
| **D — Spectroscopy hub** | Ex-situ polymer ID | **High lab performance**, **low edge** | W4200249418, W4366815281 |
| **E — Consumer / phone lab** | On-site extraction + YOLO | **Not verified** | W4409162823 (forward) |
| **F — Real-time flume camera** | Lab simulation | **Not verified** | W4400418758 (forward) |

For **Colombia / Magdalena / Caribbean** programmes (Section 6), **no tier A–F stack is nationally validated**; tier **D + selective B or C** is the most defensible **interim** architecture if capital exists—not “edge-only” MP compliance.

### 7.4 Contradiction: cheap narrative vs instrument reality

- **W4391319604** authors frame **portable/cost-effective** polarization holographic imaging; the summary notes **specialized optical hardware** that likely exceeds typical LATAM university lab budgets without dedicated funding.  
- **W4391755619** field UV capture enables Faster R-CNN polymer boxes but **low mAP** undermines operational detection (Section 3.5).  
- **W4409162823** abstract cites **~$10** TinyScope attachment for **consumer products**—misaligned with continuous river monitoring.  
- **Phone + YOLO** and **AI camera** papers illustrate **marketing-ready** low-cost narratives ahead of aquatic field replication (GAP-EDGE-01).

### 7.5 Implications and Q10

**Low-cost edge deployment is not evidence-ready** for aquatic MP monitoring in this map. Funders should budget for **spectroscopy access or shared regional hubs** (W4200249418, W4366815281) plus **pilot-scale field CV** with explicit polymer ground-truth—not headline “edge AI” abstracts alone.

[questions.md](../../questions.md) **Q10** (tiered edge + spectroscopy stack) remains a **synthesis proposal**: it is supported as a **pattern** by separate strong lab spectroscopy (W4200249418) and weak field edge CV (W4391755619), but **not** validated as an integrated product in any obtained full text.

## References

Bibliography (**n = 228**) auto-generated from [`corpus/structured/papers.jsonl`](../../corpus/structured/papers.jsonl) (OpenAlex harvest; search lock **2026-05-18**). Each entry leads with OpenAlex **`paper_id`** for traceability in this map. Regenerate: `python3 scripts/generate-references.py`.

### Full text obtained and synthesised (n = 31)

- **W4391755619** — Thunchanok Thammasanya et al. (2024). A new approach to classifying polymer type of microplastics based on Faster-RCNN-FPN and spectroscopic imagery under ultraviolet light. *Scientific Reports.* https://doi.org/10.1038/s41598-024-53251-5
- **W4393943493** — Yanqi Shi et al. (2024). Analysis of aged microplastics: a review. *Environmental Chemistry Letters.* https://doi.org/10.1007/s10311-024-01731-5 [review]
- **W4391319604** — Yuxing Li et al. (2024). High-throughput microplastic assessment using polarization holographic imaging. *Scientific Reports.* https://doi.org/10.1038/s41598-024-52762-5
- **W4313826580** — Yan Zhang et al. (2023). A Critical Review on Artificial Intelligence—Based Microplastics Imaging Technology: Recent Advances, Hot-Spots and Challenges. *International Journal of Environmental Research and Public Health.* https://doi.org/10.3390/ijerph20021150 [review]
- **W4383534319** — Liyuan Gong et al. (2023). A microfluidic approach for label-free identification of small-sized microplastics in seawater. *Scientific Reports.* https://doi.org/10.1038/s41598-023-37900-9
- **W4385411640** — Uwe Schnepf et al. (2023). A practical primer for image-based particle measurements in microplastic research. *Microplastics and Nanoplastics.* https://doi.org/10.1186/s43591-023-00064-4
- **W4318615471** — Owen Tamin et al. (2023). A review of hyperspectral imaging-based plastic waste detection state-of-the-arts. *International Journal of Electrical and Computer Engineering.* https://doi.org/10.11591/ijece.v13i3.pp3407-3419 [review]
- **W4380082849** — Miguel M. Duarte & Leonardo Azevedo (2023). Automatic Detection and Identification of Floating Marine Debris Using Multispectral Satellite Imagery. *IEEE Transactions on Geoscience and Remote Sensing.* https://doi.org/10.1109/tgrs.2023.3283607
- **W4382931577** — Lifang Xie et al. (2023). Automatic Identification of Individual Nanoplastics by Raman Spectroscopy Based on Machine Learning. *Environmental Science & Technology.* https://doi.org/10.1021/acs.est.3c03210
- **W4366815281** — Felix Weber et al. (2023). Development of a machine learning-based method for the analysis of microplastics in environmental samples using µ-Raman spectroscopy. *Microplastics and Nanoplastics.* https://doi.org/10.1186/s43591-023-00057-3
- **W4382940669** — Aravin Prince Periyasamy (2023). Environmentally Friendly Approach to the Reduction of Microplastics during Domestic Washing: Prospects for Machine Vision in Microplastics Reduction. *Toxics.* https://doi.org/10.3390/toxics11070575 [review]
- **W4362015000** — Sheela Ramanna et al. (2023). Machine Learning of polymer types from the spectral signature of Raman spectroscopy microplastics data. *Advances in Artificial Intelligence and Machine Learning.* https://doi.org/10.54364/aaiml.2023.1144
- **W4321194910** — Federico Zocco et al. (2023). Towards More Efficient EfficientDets and Real-Time Marine Debris Detection. *IEEE Robotics and Automation Letters.* https://doi.org/10.1109/lra.2023.3245405
- **W4291123479** — Nur Athirah Zailan et al. (2022). An automated solid waste detection using the optimized YOLO model for riverine management. *Frontiers in Public Health.* https://doi.org/10.3389/fpubh.2022.907280
- **W4213300830** — Ludovica Fiore et al. (2022). Classification and distribution of freshwater microplastics along the Italian Po river by hyperspectral imaging. *Environmental Science and Pollution Research.* https://doi.org/10.1007/s11356-022-18501-x
- **W4290026722** — Jiayu Cao et al. (2022). Coronas of micro/nano plastics: a key determinant in their risk assessments. *Particle and Fibre Toxicology.* https://doi.org/10.1186/s12989-022-00492-9 [review]
- **W4205835860** — Katerina Kikaki et al. (2022). MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. *PLoS ONE.* https://doi.org/10.1371/journal.pone.0262247
- **W4210266455** — Boda Ravi Kiran et al. (2022). Micro/nano-plastics occurrence, identification, risk analysis and mitigation: challenges and perspectives. *Reviews in Environmental Science and Bio/Technology.* https://doi.org/10.1007/s11157-021-09609-6 [review]
- **W4282979647** — Ho-min Park et al. (2022). MP-Net: Deep learning-based segmentation for fluorescence microscopy images of microplastics isolated from clams. *PLoS ONE.* https://doi.org/10.1371/journal.pone.0269449
- **W4308496878** — Julia Sophie Böke et al. (2022). Optical photothermal infrared spectroscopy with simultaneously acquired Raman spectroscopy for two-dimensional microplastic identification. *Scientific Reports.* https://doi.org/10.1038/s41598-022-23318-2
- **W4296114416** — Indrani Chakraborty et al. (2022). Raman spectroscopy for microplastic detection in water sources: a systematic review. *International Journal of Environmental Science and Technology.* https://doi.org/10.1007/s13762-022-04505-0 [review]
- **W4304690559** — Lara Mikac et al. (2022). Surface-enhanced Raman spectroscopy for the detection of microplastics. *Applied Surface Science.* https://doi.org/10.1016/j.apsusc.2022.155239
- **W3204790372** — Hongjie Deng et al. (2021). An Embeddable Algorithm for Automatic Garbage Detection Based on Complex Marine Environment. *Sensors.* https://doi.org/10.3390/s21196391
- **W3172017684** — Claudia Post et al. (2021). Application of Laser-Induced, Deep UV Raman Spectroscopy and Artificial Intelligence in Real-Time Environmental Monitoring—Solutions and First Results. *Sensors.* https://doi.org/10.3390/s21113911
- **W4200249418** — Benedikt Hufnagl et al. (2021). Computer-Assisted Analysis of Microplastics in Environmental Samples Based on μFTIR Imaging in Combination with Machine Learning. *Environmental Science & Technology Letters.* https://doi.org/10.1021/acs.estlett.1c00851
- **W3122508379** — Nitin Agarwala (2021). Managing Marine Environmental Pollution using Artificial Intelligence. *Maritime Technology and Research.* https://doi.org/10.33175/mtr.2021.248053
- **W3134265767** — Stefania Mariano et al. (2021). Micro and Nanoplastics Identification: Classic Methods and Innovative Detection Techniques. *Frontiers in Toxicology.* https://doi.org/10.3389/ftox.2021.636640 [review]
- **W3155690422** — Maria Kremezi et al. (2021). Pansharpening PRISMA Data for Marine Plastic Litter Detection Using Plastic Indexes. *IEEE Access.* https://doi.org/10.1109/access.2021.3073903
- **W3091414454** — Mattis Wolf et al. (2020). Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q). *Environmental Research Letters.* https://doi.org/10.1088/1748-9326/abbd01
- **W3003736709** — Javier Lorenzo-Navarro et al. (2020). SMACC: A System for Microplastics Automatic Counting and Classification. *IEEE Access.* https://doi.org/10.1109/access.2020.2970498
- **W2982912960** — Dierk Raabe et al. (2019). Strategies for improving the sustainability of structural metals. *Nature.* https://doi.org/10.1038/s41586-019-1702-5 [review]

### Forwarded to full text, not retrieved (n = 85)

- **W4415159344** — Justin A. Smolen et al. (2025). Adaptable microplastic classification using similarity learning on µFTIR spectra collected from µFTIR focal plane array imaging. *Proceedings of the National Academy of Sciences.* https://doi.org/10.1073/pnas.2509745122
- **W4417148647** — Yiheng Qin et al. (2025). Advances and innovations in machine learning-based spectral detection methods for trace organic pollutants. *The Analyst.* https://doi.org/10.1039/d5an00903k
- **W4410905772** — Mousumi Khanam et al. (2025). Advances in machine learning for the detection and characterization of microplastics in the environment. *Frontiers in Environmental Science.* https://doi.org/10.3389/fenvs.2025.1573579
- **W4409493738** — Daniel Prezgot et al. (2025). Automated Machine-Learning-Driven Analysis of Microplastics by TGA-FTIR for Enhanced Identification and Quantification. *Analytical Chemistry.* https://doi.org/10.1021/acs.analchem.4c06775
- **W4414305742** — Yuanli Liu et al. (2025). Deep Learning-Based Shape Classification for Hyperspectral-Imaged Microplastics. *Analytical Chemistry.* https://doi.org/10.1021/acs.analchem.5c02683
- **W4409162823** — Md. Zayed Bin Zahir Arju et al. (2025). Deep-learning enabled rapid and low-cost detection of microplastics in consumer products following on-site extraction and image processing. *RSC Advances.* https://doi.org/10.1039/d4ra07991d
- **W4408475533** — Melisa Nyakuchena et al. (2025). Deep-learning-assisted near-infrared hyperspectral imaging for microplastic classification. *Powder Technology.* https://doi.org/10.1016/j.powtec.2025.120933
- **W4412459792** — Qinchen Yang et al. (2025). Detection and classification of microplastics in simulated shoal environments using hyperspectral imaging technology. *Microchemical Journal.* https://doi.org/10.1016/j.microc.2025.114571
- **W4412059425** — Jen‐Tai Lin et al. (2025). Emerging analytical frontiers in microplastic detection: From spectroscopy to smart sensor technologies. *Talanta Open.* https://doi.org/10.1016/j.talo.2025.100514
- **W4408550134** — Octavio Villegas-Camacho et al. (2025). FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques. *Recycling.* https://doi.org/10.3390/recycling10020046
- **W4409887007** — Lifang Xie et al. (2025). Machine Learning Advancements and Strategies in Microplastic and Nanoplastic Detection. *Environmental Science & Technology.* https://doi.org/10.1021/acs.est.4c11888 [review]
- **W4413416639** — Khurram Shahzad et al. (2025). Machine learning-driven microplastics identification using ensemble stacking with Extra Tree meta-models from FTIR data. *Journal of Environmental Chemical Engineering.* https://doi.org/10.1016/j.jece.2025.118315
- **W4412430014** — Jeonghyun Lim & Dongha Shin (2025). Machine learning-enhanced Raman spectroscopy for fast nanoplastic detection at low SNR. *Sensors and Actuators B Chemical.* https://doi.org/10.1016/j.snb.2025.138316
- **W4406519085** — Ji Woo Jeon et al. (2025). Machine learning-integrated droplet microfluidic system for accurate quantification and classification of microplastics. *Water Research.* https://doi.org/10.1016/j.watres.2025.123161
- **W4408220111** — Anderson Targino da Silva Ferreira et al. (2025). Microplastic Deposits Prediction on Urban Sandy Beaches: Integrating Remote Sensing, GNSS Positioning, µ-Raman Spectroscopy, and Machine Learning Models. *Microplastics.* https://doi.org/10.3390/microplastics4010012
- **W4411766616** — Ayushi Agrawal & Sachin Solanki (2025). Raman Spectroscopy Enhanced By Machine Learning For Effective Microplastic Detection In Aquatic Systems. *International Journal of Environmental Sciences.* https://doi.org/10.64252/ax2y9422
- **W4408994100** — Masashi Tsuchiya et al. (2025). Rapid detection and quantification of Nile Red-stained microplastic particles in sediment samples. *PeerJ.* https://doi.org/10.7717/peerj.19196
- **W4406172157** — Ajinkya Nene et al. (2025). Recent advances and future technologies in nano-microplastics detection. *Environmental Sciences Europe.* https://doi.org/10.1186/s12302-024-01044-y
- **W4410228755** — Vishakha Singh et al. (2025). YOLOv7-Based Microplastic Detection: Crafting a Custom Dataset for Environmental Analysis. https://doi.org/10.1109/ieecon64081.2025.10987670
- **W4400319989** — Rajendran Thavasimuthu et al. (2024). SegNet‐VOLO model for classifying microplastic contaminants in water bodies. *Polymers for Advanced Technologies.* https://doi.org/10.1002/pat.6497
- **W4392657594** — David Plazas et al. (2024). A Study of High-Frequency Noise for Microplastics Classification Using Raman Spectroscopy and Machine Learning. *Applied Spectroscopy.* https://doi.org/10.1177/00037028241233304
- **W4400883495** — Yanmin Zhu et al. (2024). Advanced Optical Imaging Technologies for Microplastics Identification: Progress and Challenges. *Advanced Photonics Research.* https://doi.org/10.1002/adpr.202400038
- **W4404459247** — Brian Coleman (2024). An introduction to machine learning tools for the analysis of microplastics in complex matrices. *Environmental Science Processes & Impacts.* https://doi.org/10.1039/d4em00605d [review]
- **W4400366361** — Yihao Zhang et al. (2024). Artificial Intelligence-Based Microfluidic Platform for Detecting Contaminants in Water: A Review. *Sensors.* https://doi.org/10.3390/s24134350 [review]
- **W4403014824** — Silvia Serranti et al. (2024). Efficient microplastic identification by hyperspectral imaging: A comparative study of spatial resolutions, spectral ranges and classification models to define an optimal analytical protocol. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2024.176630
- **W4403561715** — Pensiri Akkajit et al. (2024). Enhanced detection and classification of microplastics in marine environments using deep learning. *Regional Studies in Marine Science.* https://doi.org/10.1016/j.rsma.2024.103880
- **W4403739326** — Nelle Meyers et al. (2024). From microplastics to pixels: testing the robustness of two machine learning approaches for automated, Nile red-based marine microplastic identification. *Environmental Science and Pollution Research.* https://doi.org/10.1007/s11356-024-35289-0
- **W4396828529** — Ismaila Abimbola et al. (2024). In-situ detection of microplastics in the aquatic environment: A systematic literature review. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2024.173111 [review]
- **W4393123575** — Megha Sunil et al. (2024). Machine learning assisted Raman spectroscopy: A viable approach for the detection of microplastics. *Journal of Water Process Engineering.* https://doi.org/10.1016/j.jwpe.2024.105150
- **W4405490729** — Rini Khamimatul Ula et al. (2024). Machine Learning Method for Microplastic Identification Using a Combination of Machine Learning and Raman Spectroscopy. https://doi.org/10.1109/isct62336.2024.10791228
- **W4405925464** — Frithjof Herb et al. (2024). Machine learning outperforms humans in microplastic characterization and reveals human labelling errors in FTIR data. *Journal of Hazardous Materials.* https://doi.org/10.1016/j.jhazmat.2024.136989
- **W4403492962** — Roberta Palmieri et al. (2024). Marine Microplastic Classification by Hyperspectral Imaging: Case Studies from the Mediterranean Sea, the Strait of Gibraltar, the Western Atlantic Ocean and the Bay of Biscay. *Applied Sciences.* https://doi.org/10.3390/app14209310
- **W4409917643** — Prashanthi N. Thota et al. (2024). Microplastic Detection in Drinking Water: A Comparative Analysis of CNN-SVM and CNN-RF Hybrid Models. https://doi.org/10.1109/ocit65031.2024.00014
- **W4391637830** — Sayo O. Fakayode et al. (2024). Microplastics: Challenges, toxicity, spectroscopic and real-time detection methods. *Applied Spectroscopy Reviews.* https://doi.org/10.1080/05704928.2024.2311130
- **W4399087900** — Olga Guselnikova et al. (2024). Pretreatment-free SERS sensing of microplastics using a self-attention-based neural network on hierarchically porous Ag foams. *Nature Communications.* https://doi.org/10.1038/s41467-024-48148-w
- **W4393127100** — Yinlong Luo et al. (2024). Quantitative analysis of microplastics in water environments based on Raman spectroscopy and convolutional neural network. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2024.171925
- **W4390658983** — Naixin Qian et al. (2024). Rapid single-particle chemical imaging of nanoplastics by SRS microscopy. *Proceedings of the National Academy of Sciences.* https://doi.org/10.1073/pnas.2300582121
- **W4400418758** — Md Abdul Baset Sarker et al. (2024). Real-Time Detection of Microplastics Using an AI Camera. *Sensors.* https://doi.org/10.3390/s24134394
- **W4404688861** — Jean-Hughes Fournier-Lupien et al. (2024). Toward in Situ Identification of Microplastics in Water Using Raman Spectroscopy and Machine Learning. https://doi.org/10.1109/oceans55160.2024.10754063
- **W4394767335** — Jin Zhu et al. (2024). YOLOv8-C2f-Faster-EMA: An Improved Underwater Trash Detection Model Based on YOLOv8. *Sensors.* https://doi.org/10.3390/s24082483
- **W4399529946** — N Shivaanivarsha et al. (2024). “WAVECLEAN” – An Innovation in Autonomous Vessel Driving Using Object Tracking and Collection of Floating Debris. https://doi.org/10.1109/ic3iot60841.2024.10550352
- **W4389933787** — Aeint Shune Thar et al. (2023). A Comparative Study of Machine Learning and Deep Learning Models for Microplastic Classification using FTIR Spectra. https://doi.org/10.1109/isai-nlp60301.2023.10354812
- **W4386607939** — Yì Wáng et al. (2023). A Review on Applications of Artificial Intelligence in Wastewater Treatment. *Sustainability.* https://doi.org/10.3390/su151813557 [review]
- **W4387344131** — Stephanie Wright et al. (2023). Application of Infrared and Near-Infrared Microspectroscopy to Microplastic Human Exposure Measurements. *Applied Spectroscopy.* https://doi.org/10.1177/00037028231199772 [review]
- **W4383070078** — Xinyu Yan et al. (2023). FRDA: Fingerprint Region based Data Augmentation using explainable AI for FTIR based microplastics classification. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2023.165340
- **W4385603585** — Vlatka Mikulec et al. (2023). Green Techniques for Detecting Microplastics in Marine with Emphasis on FTIR and NIR Spectroscopy—Short Review. *Processes.* https://doi.org/10.3390/pr11082360
- **W4315641711** — Xin Tian et al. (2023). Identification of Polymers with a Small Data Set of Mid-infrared Spectra: A Comparison between Machine Learning and Deep Learning Models. *Environmental Science & Technology Letters.* https://doi.org/10.1021/acs.estlett.2c00949
- **W4385543194** — Reaha Goyetche et al. (2023). Issues with the detection and classification of microplastics in marine sediments with chemical imaging and machine learning. *TrAC Trends in Analytical Chemistry.* https://doi.org/10.1016/j.trac.2023.117221
- **W4388498851** — Marc Rußwurm et al. (2023). Large-scale detection of marine debris in coastal areas with Sentinel-2. *iScience.* https://doi.org/10.1016/j.isci.2023.108402
- **W4366962764** — Jordi Valls Conesa et al. (2023). Random forest microplastic classification using spectral subsamples of FT-IR hyperspectral images. *Analytical Methods.* https://doi.org/10.1039/d3ay00514c
- **W4385454320** — B Corrigan et al. (2023). Real-Time Instance Segmentation for Detection of Underwater Litter as a Plastic Source. *Journal of Marine Science and Engineering.* https://doi.org/10.3390/jmse11081532
- **W4367055932** — Yaping Qi et al. (2023). Recent Progresses in Machine Learning Assisted Raman Spectroscopy. *Advanced Optical Materials.* https://doi.org/10.1002/adom.202203104
- **W4385737119** — Fan Liu et al. (2023). Shapes of Hyperspectral Imaged Microplastics. *Environmental Science & Technology.* https://doi.org/10.1021/acs.est.3c03517
- **W4225808234** — Xinyu Yan et al. (2022). An Ensemble Machine Learning Method for Microplastics Identification with Ftir Spectrum. *SSRN Electronic Journal.* https://doi.org/10.2139/ssrn.4059945
- **W4283211652** — Xinyu Yan et al. (2022). An ensemble machine learning method for microplastics identification with FTIR spectrum. *Journal of Environmental Chemical Engineering.* https://doi.org/10.1016/j.jece.2022.108130
- **W4310289268** — Benjamin Lei et al. (2022). Customizable Machine-Learning Models for Rapid Microplastic Identification Using Raman Microscopy. *Analytical Chemistry.* https://doi.org/10.1021/acs.analchem.2c02451
- **W4280593121** — S. Veerasingam et al. (2022). Detection and assessment of marine litter in an uninhabited island, Arabian Gulf: A case study with conventional and machine learning approaches. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2022.156064
- **W4283579124** — Nisha Maharjan et al. (2022). Detection of River Plastic Using UAV Sensor Data and Deep Learning. *Remote Sensing.* https://doi.org/10.3390/rs14133049
- **W4294957642** — Ilnur Ishmukhametov et al. (2022). Identification of micro- and nanoplastics released from medical masks using hyperspectral imaging and deep learning. *The Analyst.* https://doi.org/10.1039/d2an01139e
- **W4292417018** — Madhuraj Palat Kannankai et al. (2022). Machine learning aided meta-analysis of microplastic polymer composition in global marine environment. *Journal of Hazardous Materials.* https://doi.org/10.1016/j.jhazmat.2022.129801 [review]
- **W4285107929** — Iza Sazanita Isa et al. (2022). Optimizing the Hyperparameter Tuning of YOLOv5 for Underwater Detection. *IEEE Access.* https://doi.org/10.1109/access.2022.3174583
- **W4310048683** — Yinlong Luo et al. (2022). Raman Spectroscopy and Machine Learning for Microplastics Identification and Classification in Water Environments. *IEEE Journal of Selected Topics in Quantum Electronics.* https://doi.org/10.1109/jstqe.2022.3222065
- **W4292363386** — Jia-yu Lin et al. (2022). Recent advances in the application of machine learning methods to improve identification of the microplastics in environment. *Chemosphere.* https://doi.org/10.1016/j.chemosphere.2022.136092 [review]
- **W4295532849** — Wahid Ali Hamood Altowayti et al. (2022). The Role of Conventional Methods and Artificial Intelligence in the Wastewater Treatment: A Comprehensive Review. *Processes.* https://doi.org/10.3390/pr10091832 [review]
- **W3157156599** — Cristiane Vidal & Célio Pasquini (2021). A comprehensive and fast microplastics identification based on near-infrared hyperspectral imaging (HSI-NIR) and chemometrics. *Environmental Pollution.* https://doi.org/10.1016/j.envpol.2021.117251
- **W3120546273** — Odei Garcia‐Garin et al. (2021). Automatic detection and quantification of floating marine macro-litter in aerial images: Introducing a novel deep learning approach connected to a web application in R. *Environmental Pollution.* https://doi.org/10.1016/j.envpol.2021.116490
- **W3176625739** — Ivana Marin et al. (2021). Deep-Feature-Based Approach to Marine Debris Classification. *Applied Sciences.* https://doi.org/10.3390/app11125644
- **W3198640376** — Bing Xue et al. (2021). Deep-Sea Debris Identification Using Deep Convolutional Neural Networks. *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.* https://doi.org/10.1109/jstars.2021.3107853
- **W3157707524** — Tomo Kitahashi et al. (2021). Development of robust models for rapid classification of microplastic polymer types based on near infrared hyperspectral images. *Analytical Methods.* https://doi.org/10.1039/d1ay00110h
- **W3188289237** — Amy Lusher et al. (2021). Moving forward in microplastic research: A Norwegian perspective. *Environment International.* https://doi.org/10.1016/j.envint.2021.106794 [review]
- **W3196128465** — Henrique de Medeiros Back et al. (2021). Training and evaluating machine learning algorithms for ocean microplastics classification through vibrational spectroscopy. *Chemosphere.* https://doi.org/10.1016/j.chemosphere.2021.131903
- **W3214223885** — Fabio Corradini et al. (2021). uFTIR: An R package to process hyperspectral images of environmental samples captured with μFTIR microscopes. *SoftwareX.* https://doi.org/10.1016/j.softx.2021.100857
- **W3024533279** — Gordana Jakovljević et al. (2020). A Deep Learning Model for Automatic Plastic Mapping Using Unmanned Aerial Vehicle (UAV) Data. *Remote Sensing.* https://doi.org/10.3390/rs12091515
- **W3006187093** — Yulan Zhang et al. (2020). Atmospheric microplastics: A review on current status and perspectives. *Earth-Science Reviews.* https://doi.org/10.1016/j.earscirev.2020.103118 [review]
- **W3087291236** — Vitor Hugo da Silva et al. (2020). Classification and Quantification of Microplastics (<100 μm) Using a Focal Plane Array–Fourier Transform Infrared Imaging System and Machine Learning. *Analytical Chemistry.* https://doi.org/10.1021/acs.analchem.0c01324
- **W3033507295** — Win Cowger et al. (2020). Critical Review of Processing and Classification Techniques for Images and Spectra in Microplastic Research. *Applied Spectroscopy.* https://doi.org/10.1177/0003702820929064
- **W3094752424** — Javier Lorenzo-Navarro et al. (2020). Deep learning approach for automatic microplastics counting and classification. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2020.142728
- **W3110966552** — Melissa Wagner & Tincuta Heinzel (2020). Human Perceptions of Recycled Textiles and Circular Fashion: A Systematic Literature Review. *Sustainability.* https://doi.org/10.3390/su122410599 [review]
- **W3036200362** — Hui Huang et al. (2020). Hyperspectral Imaging as a Potential Online Detection Method of Microplastics. *Bulletin of Environmental Contamination and Toxicology.* https://doi.org/10.1007/s00128-020-02902-0 [review]
- **W3032111109** — Yiyang Chen et al. (2020). Identification and quantification of microplastics using Fourier-transform infrared spectroscopy: Current status and future prospects. *Current Opinion in Environmental Science & Health.* https://doi.org/10.1016/j.coesh.2020.05.004
- **W3048906249** — Gil Gonçalves et al. (2020). Quantifying Marine Macro Litter Abundance on a Sandy Beach Using Unmanned Aerial Systems and Object-Oriented Machine Learning Methods. *Remote Sensing.* https://doi.org/10.3390/rs12162599
- **W2952839204** — Mikaël Kedzierski et al. (2019). A machine learning algorithm for high throughput identification of FTIR spectra: Application on microplastics collected in the Mediterranean Sea. *Chemosphere.* https://doi.org/10.1016/j.chemosphere.2019.05.113
- **W2936115560** — Yituo Zhang et al. (2019). Hyperspectral Imaging Based Method for Rapid Detection of Microplastics in the Intestinal Tracts of Fish. *Environmental Science & Technology.* https://doi.org/10.1021/acs.est.8b07321
- **W2983405728** — Silvia Serranti et al. (2019). Microplastics characterization by hyperspectral imaging in the SWIR range. https://doi.org/10.1117/12.2542793
- **W2981466468** — Jun-ichiro Watanabe et al. (2019). Underwater and airborne monitoring of marine ecosystems and debris. *Journal of Applied Remote Sensing.* https://doi.org/10.1117/1.jrs.13.044509

### Excluded at title/abstract screening (n = 112)

- **W7125597175** — B. Sudarshan Acharya et al. (2026). Quality Analysis and Detection of Adulterants and Contaminations in Milk/Milk Powder by Raman Spectroscopy. *Comprehensive Reviews in Food Science and Food Safety.* https://doi.org/10.1111/1541-4337.70403
- **W4407139393** — Umme Thayyiba Khatoon & Aditya Velidandi (2025). An Overview on the Role of Government Initiatives in Nanotechnology Innovation for Sustainable Economic Development and Research Progress. *Sustainability.* https://doi.org/10.3390/su17031250
- **W4415900854** — Rahul Joshi et al. (2025). Detection and classification of microplastics in green tea using SERS with gold nanoparticle substrates integrating chemometrics and deep learning. *Current Research in Food Science.* https://doi.org/10.1016/j.crfs.2025.101235
- **W4408680321** — Michael Seidel et al. (2025). Efficient screening of microplastics in soils using hyperspectral imaging in the short-wave infrared range coupled with machine learning – A laboratory-based experiment. *Ecological Indicators.* https://doi.org/10.1016/j.ecolind.2025.113301
- **W4406219082** — Chunhui Xie et al. (2025). Machine Learning Approaches in Polymer Science: Progress and Fundamental for a New Paradigm. *SmartMat.* https://doi.org/10.1002/smm2.1320
- **W4409727181** — Ming-Fang Cheng et al. (2025). Modern Trends and Recent Applications of Hyperspectral Imaging: A Review. *Technologies.* https://doi.org/10.3390/technologies13050170 [review]
- **W4411450449** — Yanping Zhu et al. (2025). Study on the identification of microplastics in agricultural soils using segmented GASF combined with deep learning based on confocal micro-Raman spectroscopy. *Measurement.* https://doi.org/10.1016/j.measurement.2025.118226
- **W4409406638** — Unknown authors (2025). YOLO-CE for Accurate Livestock Detection in Challenging Landfill Environments. *International Journal of Intelligent Engineering and Systems.* https://doi.org/10.22266/ijies2025.0531.51
- **W4390880644** — Xiaohui Yan et al. (2024). A Comprehensive Review of Machine Learning for Water Quality Prediction over the Past Five Years. *Journal of Marine Science and Engineering.* https://doi.org/10.3390/jmse12010159 [review]
- **W4390604011** — Yihao Zhou et al. (2024). A multimodal magnetoelastic artificial skin for underwater haptic sensing. *Science Advances.* https://doi.org/10.1126/sciadv.adj8567
- **W4390939303** — Md. Mosarrof Hossen et al. (2024). A Reliable and Robust Deep Learning Model for Effective Recyclable Waste Classification. *IEEE Access.* https://doi.org/10.1109/access.2024.3354774
- **W4392716026** — Rok Pučnik et al. (2024). A waste separation system based on sensor technology and deep learning: A simple approach applied to a case study of plastic packaging waste. *Journal of Cleaner Production.* https://doi.org/10.1016/j.jclepro.2024.141762
- **W4392747860** — Ismail Essamlali et al. (2024). Advances in machine learning and IoT for water quality monitoring: A comprehensive review. *Heliyon.* https://doi.org/10.1016/j.heliyon.2024.e27920 [review]
- **W4401332314** — Edgar Ovidio Barrón Ramos et al. (2024). Application of Machine Learning in Plastic Waste Detection and Classification: A Systematic Review. *Processes.* https://doi.org/10.3390/pr12081632 [review]
- **W4391968719** — Simona Mariana Popescu et al. (2024). Artificial intelligence and IoT driven technologies for environmental pollution monitoring and management. *Frontiers in Environmental Science.* https://doi.org/10.3389/fenvs.2024.1336088
- **W4392545756** — Christian Ebere Enyoh & Qingyue Wang (2024). Automated Classification of Undegraded and Aged Polyethylene Terephthalate Microplastics from ATR-FTIR Spectroscopy using Machine Learning Algorithms. *Journal of Polymers and the Environment.* https://doi.org/10.1007/s10924-024-03199-4
- **W4390728072** — Aboi Igwaran et al. (2024). Cyanobacteria Harmful Algae Blooms: Causes, Impacts, and Risk Management. *Water Air & Soil Pollution.* https://doi.org/10.1007/s11270-023-06782-y
- **W4390751491** — Yazhou Qin et al. (2024). Deep learning analysis for rapid detection and classification of household plastics based on Raman spectroscopy. *Spectrochimica Acta Part A Molecular and Biomolecular Spectroscopy.* https://doi.org/10.1016/j.saa.2024.123854
- **W4393337124** — Mohamed Fadhlallah Guerri et al. (2024). Deep learning techniques for hyperspectral image analysis in agriculture: A review. *ISPRS Open Journal of Photogrammetry and Remote Sensing.* https://doi.org/10.1016/j.ophoto.2024.100062 [review]
- **W4405895621** — Nohyeong Jeong et al. (2024). Elucidating governing factors of PFAS removal by polyamide membranes using machine learning and molecular simulations. *Nature Communications.* https://doi.org/10.1038/s41467-024-55320-9
- **W4393183551** — Chloé Gicquel et al. (2024). Generation of synthetic FTIR spectra to facilitate chemical identification of microplastics. *Marine Pollution Bulletin.* https://doi.org/10.1016/j.marpolbul.2024.116295
- **W4390475396** — Kaiqiang Wang et al. (2024). On the use of deep learning for phase recovery. *Light Science & Applications.* https://doi.org/10.1038/s41377-023-01340-x [review]
- **W4398762811** — B. Thangagiri & R. Sivakumar (2024). Prospective Application of Artificial Intelligence Towards the Detection, and Classifications of Microplastics with Bibliometric Analysis. *Water Air & Soil Pollution.* https://doi.org/10.1007/s11270-024-07151-z
- **W4405372748** — Shruti Gupta et al. (2024). Recent Developments in Recirculating Aquaculture Systems: A Review. *Aquaculture Research.* https://doi.org/10.1155/are/6096671 [review]
- **W4393306481** — M. K. Nallakaruppan et al. (2024). Reliable water quality prediction and parametric analysis using explainable AI models. *Scientific Reports.* https://doi.org/10.1038/s41598-024-56775-y
- **W4404968855** — Tariq Ali et al. (2024). Smart agriculture: utilizing machine learning and deep learning for drought stress identification in crops. *Scientific Reports.* https://doi.org/10.1038/s41598-024-74127-8
- **W4402521085** — Svetlana N. Khonina et al. (2024). Synergy between Artificial Intelligence and Hyperspectral Imagining—A Review. *Technologies.* https://doi.org/10.3390/technologies12090163
- **W4400653066** — Jia Ning et al. (2024). The Diversity of Artificial Intelligence Applications in Marine Pollution: A Systematic Literature Review. *Journal of Marine Science and Engineering.* https://doi.org/10.3390/jmse12071181 [review]
- **W4399091078** — Nutcha Taneepanichskul et al. (2024). Using hyperspectral imaging to identify and classify large microplastic contamination in industrial composting processes. *Frontiers in Sustainability.* https://doi.org/10.3389/frsus.2024.1332163
- **W4386170962** — Khandoker Samaher Salem et al. (2023). A critical review of existing and emerging technologies and systems to optimize solid waste management for feedstocks and energy conversion. *Matter.* https://doi.org/10.1016/j.matt.2023.08.003 [review]
- **W4386325177** — Guannan Huang et al. (2023). Application of Machine Learning in Material Synthesis and Property Prediction. *Materials.* https://doi.org/10.3390/ma16175977 [review]
- **W4382182093** — Bingbing Fang et al. (2023). Artificial intelligence for waste management in smart cities: a review. *Environmental Chemistry Letters.* https://doi.org/10.1007/s10311-023-01604-3 [review]
- **W4390155689** — Adedayo Adefemi et al. (2023). Artificial intelligence in environmental health and public safety: A comprehensive review of USA strategies. *World Journal of Advanced Research and Reviews.* https://doi.org/10.30574/wjarr.2023.20.3.2591 [review]
- **W4323830100** — Zikang Feng et al. (2023). Classification of household microplastics using a multi-model approach based on Raman spectroscopy. *Chemosphere.* https://doi.org/10.1016/j.chemosphere.2023.138312
- **W4315881621** — Atif Khurshid Wani et al. (2023). Discovering untapped microbial communities through metagenomics for microplastic remediation: recent advances, challenges, and way forward. *Environmental Science and Pollution Research.* https://doi.org/10.1007/s11356-023-25192-5 [review]
- **W4317425328** — Ηλίας Δρίτσας & Μαρία Τρίγκα (2023). Efficient Data-Driven Machine Learning Models for Water Quality Prediction. *Computation.* https://doi.org/10.3390/computation11020016
- **W4360603582** — Daniel Tong et al. (2023). Health and Safety Effects of Airborne Soil Dust in the Americas and Beyond. *Reviews of Geophysics.* https://doi.org/10.1029/2021rg000763
- **W4383343712** — Thibaut Van Acker et al. (2023). Inductively coupled plasma mass spectrometry. *Nature Reviews Methods Primers.* https://doi.org/10.1038/s43586-023-00235-w
- **W4366548079** — Nereida Rodriguez-Alvarez et al. (2023). Latest Advances in the Global Navigation Satellite System—Reflectometry (GNSS-R) Field. *Remote Sensing.* https://doi.org/10.3390/rs15082157
- **W4365448739** — Răzvan Bogdan et al. (2023). Low-Cost Internet-of-Things Water-Quality Monitoring System for Rural Areas. *Sensors.* https://doi.org/10.3390/s23083919
- **W4387140887** — Peter Rubbens et al. (2023). Machine learning in marine ecology: an overview of techniques and applications. *ICES Journal of Marine Science.* https://doi.org/10.1093/icesjms/fsad100
- **W4387264069** — Sabastian Simbarashe Mukonza & Jie‐Lun Chiang (2023). Meta-Analysis of Satellite Observations for United Nations Sustainable Development Goals: Exploring the Potential of Machine Learning for Water Quality Monitoring. *Environments.* https://doi.org/10.3390/environments10100170 [review]
- **W4386416320** — Ren-Shou Yu & Sher Singh (2023). Microplastic Pollution: Threats and Impacts on Global Marine Ecosystems. *Sustainability.* https://doi.org/10.3390/su151713252
- **W4388571175** — Daniele la Cecilia et al. (2023). Microplastics attenuation from surface water to drinking water: Impact of treatment and managed aquifer recharge – and identification uncertainties. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2023.168378
- **W4324091125** — Xiaobo Li et al. (2023). Polarimetric Imaging via Deep Learning: A Review. *Remote Sensing.* https://doi.org/10.3390/rs15061540 [review]
- **W4315647674** — Samantha Phan & Christine K. Luscombe (2023). Recent trends in marine microplastic modeling and machine learning tools: Potential for long-term microplastic monitoring. *Journal of Applied Physics.* https://doi.org/10.1063/5.0126358
- **W4378832918** — Vijay Kumar Gugulothu & Sai A. Balaji (2023). RETRACTED ARTICLE: An early prediction and classification of lung nodule diagnosis on CT images based on hybrid deep learning techniques. *Multimedia Tools and Applications.* https://doi.org/10.1007/s11042-023-15802-2
- **W4386503277** — Aravin Prince Periyasamy & Saravanan Periyasami (2023). Rise of digital fashion and metaverse: influence on sustainability. *Digital Economy and Sustainable Development.* https://doi.org/10.1007/s44265-023-00016-z
- **W4317358628** — Caleb Kruse et al. (2023). Satellite monitoring of terrestrial plastic waste. *PLoS ONE.* https://doi.org/10.1371/journal.pone.0278997
- **W4380153990** — Lijia Xu et al. (2023). Study on detection method of microplastics in farmland soil based on hyperspectral imaging technology. *Environmental Research.* https://doi.org/10.1016/j.envres.2023.116389
- **W4319942604** — Stefano Abbate et al. (2023). Sustainability trends and gaps in the textile, apparel and fashion industries. *Environment Development and Sustainability.* https://doi.org/10.1007/s10668-022-02887-2 [review]
- **W4386275778** — Reza Mohammadi Asiyabi et al. (2023). Synthetic Aperture Radar (SAR) for Ocean: A Review. *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.* https://doi.org/10.1109/jstars.2023.3310363 [review]
- **W4365812881** — Yunzhe Li et al. (2023). The application of machine learning to air pollution research: A bibliometric analysis. *Ecotoxicology and Environmental Safety.* https://doi.org/10.1016/j.ecoenv.2023.114911
- **W4385722532** — Jidong Zhao et al. (2023). The role of particle shape in computational modelling of granular matter. *Nature Reviews Physics.* https://doi.org/10.1038/s42254-023-00617-9 [review]
- **W4292410749** — Nutcha Taneepanichskul et al. (2022). A Review of Sorting and Separating Technologies Suitable for Compostable and Biodegradable Plastic Packaging. *Frontiers in Sustainability.* https://doi.org/10.3389/frsus.2022.901885 [review]
- **W4294794016** — Zhilong Kang et al. (2022). Advances in Machine Learning and Hyperspectral Imaging in the Food Supply Chain. *Food Engineering Reviews.* https://doi.org/10.1007/s12393-022-09322-2 [review]
- **W4309458231** — Claudia Campanale et al. (2022). An Overall Perspective for the Study of Emerging Contaminants in Karst Aquifers. *Resources.* https://doi.org/10.3390/resources11110105
- **W4311421434** — Wenjie Ai et al. (2022). Application of hyperspectral and deep learning in farmland soil microplastic detection. *Journal of Hazardous Materials.* https://doi.org/10.1016/j.jhazmat.2022.130568
- **W4205478103** — Wen Jiang et al. (2022). Artificial Neural Networks and Deep Learning Techniques Applied to Radar Target Detection: A Review. *Electronics.* https://doi.org/10.3390/electronics11010156 [review]
- **W4307290447** — Edward Ren Kai Neo et al. (2022). Deep learning for chemometric analysis of plastic spectral data from infrared and Raman databases. *Resources Conservation and Recycling.* https://doi.org/10.1016/j.resconrec.2022.106718
- **W4280596693** — Ilaria Savino et al. (2022). Effects and Impacts of Different Oxidative Digestion Treatments on Virgin and Aged Microplastic Particles. *Polymers.* https://doi.org/10.3390/polym14101958
- **W4308902188** — Abdo Hassoun et al. (2022). Food processing 4.0: Current and future developments spurred by the fourth industrial revolution. *Food Control.* https://doi.org/10.1016/j.foodcont.2022.109507
- **W4292343549** — Shaik Vaseem Akram et al. (2022). Implementation of Digitalized Technologies for Fashion Industry 4.0: Opportunities and Challenges. *Scientific Programming.* https://doi.org/10.1155/2022/7523246
- **W4310871764** — Yue Hao et al. (2022). Improved detection and counting performance of microplastics in common carp whole blood by an attention-guided deep learning method. https://doi.org/10.1117/12.2656028
- **W4221105061** — Mostafa Bigdeli et al. (2022). Lagrangian Modeling of Marine Microplastics Fate and Transport: The State of the Science. *Journal of Marine Science and Engineering.* https://doi.org/10.3390/jmse10040481
- **W4287958419** — Hongwei Ning et al. (2022). Machine learning for microalgae detection and utilization. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2022.947394
- **W4308702797** — Saravanan Periyasami & Aravin Prince Periyasamy (2022). Metaverse as Future Promising Platform Business Model: Case Study on Fashion Value Chain. *Businesses.* https://doi.org/10.3390/businesses2040033
- **W4307342666** — Tapati Roy et al. (2022). Microplastic/nanoplastic toxicity in plants: an imminent concern. *Environmental Monitoring and Assessment.* https://doi.org/10.1007/s10661-022-10654-z [review]
- **W4307726487** — Samet Ozturk et al. (2022). Near-infrared spectroscopy and machine learning for classification of food powders during a continuous process. *Journal of Food Engineering.* https://doi.org/10.1016/j.jfoodeng.2022.111339
- **W4307372666** — Zhiqiang Gao et al. (2022). On airborne tire wear particles along roads with different traffic characteristics using passive sampling and optical microscopy, single particle SEM/EDX, and µ-ATR-FTIR analyses. *Frontiers in Environmental Science.* https://doi.org/10.3389/fenvs.2022.1022697
- **W4306835696** — Zhenxing Cai et al. (2022). Research on Waste Plastics Classification Method Based on Multi-Scale Feature Fusion. *Sensors.* https://doi.org/10.3390/s22207974
- **W4224243654** — Lei Kou et al. (2022). Review on Monitoring, Operation and Maintenance of Smart Offshore Wind Farms. *Sensors.* https://doi.org/10.3390/s22082822 [review]
- **W4289315315** — Ranjith Dinakaran et al. (2022). Robust and Fair Undersea Target Detection with Automated Underwater Vehicles for Biodiversity Data Collection. *Remote Sensing.* https://doi.org/10.3390/rs14153680
- **W4291825467** — Rebecca Ruckdashel et al. (2022). Smart E-Textiles: Overview of Components and Outlook. *Sensors.* https://doi.org/10.3390/s22166055 [review]
- **W4290930152** — Alissa H. Tophinke et al. (2022). Systematic development of extraction methods for quantitative microplastics analysis in soils using metal-doped plastics. *Environmental Pollution.* https://doi.org/10.1016/j.envpol.2022.119933
- **W4294628862** — Nick Chater & George Loewenstein (2022). The i-frame and the s-frame: How focusing on individual-level solutions has led behavioral public policy astray. *Behavioral and Brain Sciences.* https://doi.org/10.1017/s0140525x22002023
- **W4308798344** — Yunchao Xie et al. (2022). Toward autonomous laboratories: Convergence of artificial intelligence and experimental automation. *Progress in Materials Science.* https://doi.org/10.1016/j.pmatsci.2022.101043
- **W3168386798** — Iván Palomares et al. (2021). A panoramic view and swot analysis of artificial intelligence for achieving the sustainable development goals by 2030: progress and prospects. *Applied Intelligence.* https://doi.org/10.1007/s10489-021-02264-y
- **W3171119212** — Dário Passos & Puneet Mishra (2021). An automated deep learning pipeline based on advanced optimisations for leveraging spectral classification modelling. *Chemometrics and Intelligent Laboratory Systems.* https://doi.org/10.1016/j.chemolab.2021.104354
- **W3207695224** — Wenjie Ai et al. (2021). Application of hyperspectral imaging technology in the rapid identification of microplastics in farmland soil. *The Science of The Total Environment.* https://doi.org/10.1016/j.scitotenv.2021.151030
- **W3128206789** — Shaobo Luo et al. (2021). Deep learning‐enabled imaging flow cytometry for high‐speed *Cryptosporidium* and *Giardia* detection. *Cytometry Part A.* https://doi.org/10.1002/cyto.a.24321
- **W3213044275** — Jean‐Paul Lange (2021). Managing Plastic Waste─Sorting, Recycling, Disposal, and Product Redesign. *ACS Sustainable Chemistry & Engineering.* https://doi.org/10.1021/acssuschemeng.1c05013
- **W3158766448** — Runmin Liu et al. (2021). Multiscale Dense Cross-Attention Mechanism with Covariance Pooling for Hyperspectral Image Scene Classification. *Mobile Information Systems.* https://doi.org/10.1155/2021/9962057
- **W3133407985** — Kerry Cawse‐Nicholson et al. (2021). NASA's surface biology and geology designated observable: A perspective on surface imaging algorithms. *Remote Sensing of Environment.* https://doi.org/10.1016/j.rse.2021.112349
- **W3129228121** — Mattia Delli Priscoli et al. (2021). Neuroblastoma Cells Classification Through Learning Approaches by Direct Analysis of Digital Holograms. *IEEE Journal of Selected Topics in Quantum Electronics.* https://doi.org/10.1109/jstqe.2021.3059532
- **W3127590788** — Ana Rotter et al. (2021). The Essentials of Marine Biotechnology. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2021.629629
- **W4200315900** — Martin Calisto Friant et al. (2021). Transition to a Sustainable Circular Plastics Economy in The Netherlands: Discourse and Policy Analysis. *Sustainability.* https://doi.org/10.3390/su14010190
- **W3177853603** — Pablo Otero et al. (2021). Twitter data analysis to assess the interest of citizens on the impact of marine plastic pollution. *Marine Pollution Bulletin.* https://doi.org/10.1016/j.marpolbul.2021.112620
- **W3048828116** — Cuiping Shi et al. (2020). A Novel Multi-Branch Channel Expansion Network for Garbage Image Classification. *IEEE Access.* https://doi.org/10.1109/access.2020.3016116
- **W3078750124** — Claudia Campanale et al. (2020). A Practical Overview of Methodologies for Sampling and Analysis of Microplastics in Riverine Environments. *Sustainability.* https://doi.org/10.3390/su12176755
- **W3111864906** — Eleni Iacovidou et al. (2020). A systems thinking approach to understanding the challenges of achieving the circular economy. *Environmental Science and Pollution Research.* https://doi.org/10.1007/s11356-020-11725-9
- **W3045345459** — Zhuang Kang et al. (2020). An Automatic Garbage Classification System Based on Deep Learning. *IEEE Access.* https://doi.org/10.1109/access.2020.3010496
- **W3037818147** — Fritz A. Francisco et al. (2020). High-resolution, non-invasive animal tracking and reconstruction of local environment in aquatic ecosystems. *Movement Ecology.* https://doi.org/10.1186/s40462-020-00214-w
- **W3091477941** — Fantina Madricardo et al. (2020). How to Deal With Seafloor Marine Litter: An Overview of the State-of-the-Art and Future Perspectives. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2020.505134
- **W3037523865** — Jinnuo Zhang et al. (2020). Identification of Bacterial Blight Resistant Rice Seeds Using Terahertz Imaging and Hyperspectral Imaging Combined With Convolutional Neural Network. *Frontiers in Plant Science.* https://doi.org/10.3389/fpls.2020.00821
- **W3111915298** — Omid Ghorbanzadeh et al. (2020). Landslide Mapping Using Two Main Deep-Learning Convolution Neural Network Streams Combined by the Dempster–Shafer Model. *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.* https://doi.org/10.1109/jstars.2020.3043836
- **W3039535041** — Luca Fallati et al. (2020). Multi-Temporal UAV Data and Object-Based Image Analysis (OBIA) for Estimation of Substrate Changes in a Post-Bleaching Scenario on a Maldivian Reef. *Remote Sensing.* https://doi.org/10.3390/rs12132093
- **W3005439932** — Kyungjun Min et al. (2020). Ranking environmental degradation trends of plastic marine debris based on physical properties and molecular structure. *Nature Communications.* https://doi.org/10.1038/s41467-020-14538-z
- **W3008307945** — Sarah L. C. Giering et al. (2020). Sinking Organic Particles in the Ocean—Flux Estimates From in situ Optical Devices. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2019.00834
- **W3015547804** — Abdul‐Lateef Balogun et al. (2020). Spatio-Temporal Analysis of Oil Spill Impact and Recovery Pattern of Coastal Vegetation and Wetland Using Multispectral Satellite Landsat 8-OLI Imagery and Machine Learning Models. *Remote Sensing.* https://doi.org/10.3390/rs12071225
- **W3092266267** — David W. Walker et al. (2020). The benefits and negative impacts of citizen science applications to water as experienced by participants and communities. *Wiley Interdisciplinary Reviews Water.* https://doi.org/10.1002/wat2.1488
- **W3036298078** — Patrik Söderholm (2020). The green economy transition: the challenges of technological change for sustainability. *Sustainable Earth Reviews.* https://doi.org/10.1186/s42055-020-00029-y
- **W3103847652** — Jacopo Aguzzi et al. (2020). The potential of video imagery from worldwide cabled observatory networks to provide information supporting fish-stock and biodiversity assessment. *ICES Journal of Marine Science.* https://doi.org/10.1093/icesjms/fsaa169
- **W3014097559** — Patrizia Gazzola et al. (2020). Trends in the Fashion Industry. The Perception of Sustainability and Circular Economy: A Gender/Generation Quantitative Approach. *Sustainability.* https://doi.org/10.3390/su12072809
- **W3037625619** — Tadej Peršak et al. (2020). Vision-Based Sorting Systems for Transparent Plastic Granulate. *Applied Sciences.* https://doi.org/10.3390/app10124269
- **W2998576579** — Stephanie Wright et al. (2019). Atmospheric microplastic deposition in an urban environment and an evaluation of transport. *Environment International.* https://doi.org/10.1016/j.envint.2019.105411
- **W2970419732** — Joaquín Tintoré et al. (2019). Challenges for Sustained Observing and Forecasting Systems in the Mediterranean Sea. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2019.00568
- **W2961372703** — David Barbera‐Tomás et al. (2019). Energizing through Visuals: How Social Entrepreneurs Use Emotion-Symbolic Work for Social Change. *Academy of Management Journal.* https://doi.org/10.5465/amj.2017.1488
- **W2947746912** — Lisa A. Levin et al. (2019). Global Observing Needs in the Deep Ocean. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2019.00241
- **W2948539118** — C. Anela Choy et al. (2019). The vertical distribution and biological transport of marine microplastics across the epipelagic and mesopelagic water column. *Scientific Reports.* https://doi.org/10.1038/s41598-019-44117-2
- **W2959012558** — Nikolai Maximenko et al. (2019). Toward the Integrated Marine Debris Observing System. *Frontiers in Marine Science.* https://doi.org/10.3389/fmars.2019.00447
- **W2969918022** — Enrico Ciulli (2019). Tribology and Industry: From the Origins to 4.0. *Frontiers in Mechanical Engineering.* https://doi.org/10.3389/fmech.2019.00055


# Part II — LATAM gap analysis


# LATAM gap analysis — aquatic ML/CV microplastic detection (2019–2025)

**Phase:** 9.2 | **Date:** 2026-05-18  
**Corpus:** 228 identified · 31 obtained full text · **25** LATAM-flagged records ([`latam-papers.jsonl`](../corpus/structured/latam-papers.jsonl))  
**Companion:** [gap-list.md](../knowledge/gap-list.md) (GAP-GEO-*) · [slr.md](./slr.md) §4.1 · [questions.md](../questions.md) Q9

---

## 1. Executive summary

Latin American presence in this systematic map is **real but thin for operational aquatic monitoring**. Of **228** harvested works, **25** carry a LATAM-specific signal (`priority_latam` or named country/affiliation in metadata). Only **one** names a **LATAM field site** for MP-related ML (São Paulo urban beaches, W4408220111)—and its full text was **not obtained**. **Colombia** appears via **author affiliation** on one forward Raman study (W4392657594), with **no in-country aquatic field programme** in open evidence.

By contrast, **Global South field vision** studies in this corpus are concentrated in **South and Southeast Asia** (India W4291123479, Thailand W4391755619, Cambodia W3091414454, China W3204790372)—useful for method transfer but **not LATAM-validated**.

The dominant deployment path implied by **obtained** full texts remains **ex-situ spectroscopy hubs** (μFTIR W4200249418, µ-Raman W4366815281) with high capex—misaligned with many municipal LATAM laboratory budgets (GAP-EDGE-02). A credible regional strategy is **tiered**: litter/macro surveillance where acceptable, periodic spectroscopic confirmation on subsets, and **mandatory local pilots** before procurement—not extrapolation from affiliation flags or abstract accuracy alone.

---

## 2. Scope and method

| Source | Use in this report |
| --- | --- |
| [`latam-papers.jsonl`](../corpus/structured/latam-papers.jsonl) | Authoritative LATAM subset (Phase 9.1 scan) |
| [`papers.jsonl`](../corpus/structured/papers.jsonl) | Full corpus metadata |
| [`extraction.csv`](../corpus/structured/extraction.csv) | Modality, `global_south`, metrics |
| [`latam-scan-9.1-log.md`](../corpus/structured/latam-scan-9.1-log.md) | Scan rules and counts |
| Obtained summaries (`corpus/summaries/`) | Verified performance for non-LATAM primaries cited for transfer |

**Inclusion in `latam-papers.jsonl`:** at least one of `priority_latam`, OpenAlex LATAM country code, LATAM affiliation text, or LATAM place mention—not generic “Global South” extraction alone.
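
The inclusion rule can be read as a simple "any of four signals" predicate over each corpus record. The following is a minimal sketch, not the Phase 9.1 scan itself: the field names `countries`, `latam_affiliation_hits`, and `latam_place_hits`, the toy paper IDs, and the country-code set are all assumptions made for illustration. Only `priority_latam` and `global_south` are names used in this report.

```python
# Illustrative LATAM country codes; an assumption, not the scan's actual list.
LATAM_CODES = {"AR", "BO", "BR", "CL", "CO", "CR", "CU", "DO", "EC", "GT",
               "HN", "MX", "NI", "PA", "PE", "PR", "PY", "SV", "UY", "VE"}


def is_latam_record(rec: dict) -> bool:
    """Return True if a record carries at least one LATAM-specific signal.

    `global_south` is deliberately ignored: on its own it is not a LATAM signal.
    All field names except `priority_latam` are hypothetical.
    """
    has_priority_flag = bool(rec.get("priority_latam"))
    has_country_code = bool(LATAM_CODES & set(rec.get("countries", [])))
    has_affiliation = bool(rec.get("latam_affiliation_hits"))
    has_place_mention = bool(rec.get("latam_place_hits"))
    return (has_priority_flag or has_country_code
            or has_affiliation or has_place_mention)


# Toy records with placeholder IDs (not corpus IDs).
records = [
    {"paper_id": "A", "countries": ["CO"], "global_south": "yes"},
    {"paper_id": "B", "countries": ["IN"], "global_south": "yes"},
    {"paper_id": "C", "countries": ["ES"], "priority_latam": True},
]
latam_ids = [r["paper_id"] for r in records if is_latam_record(r)]
print(latam_ids)  # ['A', 'C']
```

Record C shows how a `priority_latam` flag alone admits a paper with no LATAM geography, which is the false-positive pattern noted for W3003736709 in §3.3.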

---

## 3. Geographic inventory

### 3.1 Country coverage (OpenAlex author countries among LATAM set)

| ISO | n | Representative `paper_id` (aquatic MP ML relevance) |
| --- | ---: | --- |
| **BR** | 8 | W4408220111 (field beach MP+RS+ML, forward); W3196128465 (ocean MP spectroscopy ML, forward) |
| **MX** | 5 | W4408550134 (FTIR polymer ML, forward); W4360603582 (tangential) |
| **CL** | 3 | W3214223885 (μFTIR R package); W2959012558 (marine debris observing, review) |
| **CO** | 2 | W4392657594 (Raman ML, **affiliation only**); W2959012558 (multi-country debris review) |
| **PR** | 1 | W4360603582 (author country only) |

**Not represented** in the LATAM subset with aquatic MP ML primaries: Argentina, Peru, Ecuador, Venezuela, Central America (except metadata noise), **Magdalena/Caribbean Colombia river systems** as study sites.

### 3.2 Field site classification (`field_site_class`)

| Class | n | Meaning |
| --- | ---: | --- |
| `field` | **1** | Explicit LATAM environmental sampling geography (W4408220111, Brazil coast) |
| `affiliation_only` | **1** | CO authors; no aquatic field site in available text (W4392657594) |
| `affiliation_or_metadata_only` | **23** | LATAM author country or `priority_latam` without verified LATAM aquatic field campaign |
| `mentioned_in_text` | 0 | — |

### 3.3 Full-text access within LATAM set

| Status | n | Notes |
| --- | ---: | --- |
| Obtained FT | **1** | W3003736709 (SMACC, Spain/Italy lab beach sediment—not a LATAM field programme; likely `priority_latam` false positive) |
| Forward, FT not retrieved | **24** | Includes W4408220111 (strongest LATAM **field** signal) and W4392657594 (Colombia) |

**Implication (GAP-CORPUS-01):** LATAM performance and deployment claims cannot be verified from open full text except indirectly through non-LATAM obtained primaries.

---

## 4. Colombia and Andean gap (GAP-GEO-01)

### 4.1 What the corpus contains

| `paper_id` | Signal | Aquatic MP ML? | Field in Colombia? | FT |
| --- | --- | --- | --- | --- |
| W4392657594 | CO affiliation; Raman noise/ML classification | Yes (polymer ID topic) | **No** — laboratory Raman study | No |
| W2959012558 | CO among author countries | No (marine **debris observing** systems review) | **No** (review; no MP ML pipeline) | No |

No other row in `latam-papers.jsonl` lists **CO** in `latam_countries`.

### 4.2 What is absent

- Obtained-full-text **river, estuary, or Caribbean coastal** campaigns with ML/CV MP detection in Colombia.  
- Validation on **Magdalena**, Andean tributaries, or Pacific/Atlantic Colombian waters.  
- Municipal or national monitoring procurement evidence tied to Colombian matrices (turbidity, biofouling, polymer mix).

This confirms [questions.md](../questions.md) **Q9**: any Colombia monitoring architecture today is **transfer-based**, not literature-validated locally.

### 4.3 Defensible transfer candidates (explicit risk)

| Layer | Non-Colombia evidence | Transfer risk to CO waters |
| --- | --- | --- |
| Lab μFTIR / µ-Raman hub | W4200249418, W4366815281, W4382931577 | Capex, prep labour; matrix chemistry unproven |
| Field UV MP boxes | W4391755619 (Thailand) | Turbidity, UV fouling, small-object mAP |
| Macro litter river CV | W4291123479 (India) | Task mismatch (solid waste ≠ MPs) |
| Brazil beach RS+ML | W4408220111 (abstract) | **Unverified** metrics; sandy beach ≠ Andean river |
| Tiered litter + lab ID | slr.md §4.1.8 synthesis | Policy-acceptable if KPIs separated |

**None** substitute for a **Colombia pilot** with reported precision, recall/mAP, and polymer confirmation on local samples.

---

## 5. Brazil and Mexico signals

### 5.1 Brazil — strongest LATAM field mention, weakest verification

**W4408220111** integrates GNSS, remote sensing, µ-Raman, and ML (RF/Gradient Boosting) for **urban sandy beach MP deposition** (São Paulo). The abstract cites **6–35 particles/m²** and a model accuracy figure, but the extraction marks these metrics `unverified` and **no full text was obtained** (GAP-GEO-03).

Other Brazil-tagged rows are **affiliation or topic-adjacent**, not aquatic MP field ML:

- W3196128465 — vibrational spectroscopy ML for ocean MPs (forward).  
- W3157156599 — HSI-NIR microplastics (forward; paywalled per manifest).  
- W4387140887 — marine ecology ML overview (not MP-specific).

### 5.2 Mexico — author presence, limited aquatic MP ML FT

| `paper_id` | Topic | FT |
| --- | --- | --- |
| W4408550134 | FTIR polymer classification (normalization/ML comparison) | Forward |
| W4360603582 | Airborne dust health review (tangential) | Forward |
| W4404968855 | Agriculture drought ML (excluded scope) | Forward |

No obtained FT with **Mexican river/coast/WWTP** MP CV or spectroscopy field campaign in this map.

---

## 6. Global South vs LATAM confusion (GAP-GEO-02)

| Metric | Count |
| --- | ---: |
| `extraction.csv` rows `global_south=yes` | **12** |
| LATAM rows in `latam-papers.jsonl` | **25** |
| Overlap: LATAM + GS extraction | **4** (W4392657594, W4408220111, W4408550134, W3196128465) |
| GS field vision primaries (non-LATAM geography) | **4** (W4291123479, W4391755619, W3091414454, W3204790372) |

**Lesson:** procurement and bibliometrics must use **field geography** and `latam-papers.jsonl`, not `global_south=yes` alone. India/Thailand/Cambodia/China papers support **method feasibility** for GS contexts, not LATAM error rates.
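
The counts in the table follow from plain set arithmetic on the two ID lists. The sketch below rebuilds sets of the reported sizes to show the split; the four overlap IDs are the ones listed in the table, while the `gs-only-*` and `latam-only-*` IDs are placeholders standing in for rows not enumerated in this section.

```python
def split_flags(global_south_ids: set, latam_ids: set) -> dict:
    """Count records flagged in both sources, or in only one of them."""
    return {
        "both": len(global_south_ids & latam_ids),
        "gs_only": len(global_south_ids - latam_ids),
        "latam_only": len(latam_ids - global_south_ids),
    }


# Overlap IDs as reported in the table above.
overlap = {"W4392657594", "W4408220111", "W4408550134", "W3196128465"}

# Placeholder IDs so the sets match the reported totals (12 and 25).
global_south_ids = overlap | {f"gs-only-{i}" for i in range(8)}
latam_ids = overlap | {f"latam-only-{i}" for i in range(21)}

print(split_flags(global_south_ids, latam_ids))
# {'both': 4, 'gs_only': 8, 'latam_only': 21}
```

Read this way, 21 of the 25 LATAM-flagged rows would be missed by a `global_south=yes` filter, and 8 of the 12 Global South rows are not LATAM at all.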

---

## 7. Modality and cost gaps for LATAM deployment

| Gap ID | Issue | LATAM relevance | Evidence |
| --- | --- | --- | --- |
| GAP-EDGE-02 | High capex spectroscopy dominates validated ID | Municipal labs, shared regional hubs | W4200249418, W4366815281, W4296114416 |
| GAP-EDGE-01 | “Low-cost edge” rarely has open FT | Phone/YOLO abstracts unverified | W4409162823 forward; W4391319604 only verified edge primary (non-LATAM) |
| GAP-FIELD-01 | In-situ aquatic MP sensing review-heavy | Continuous monitoring narrative | W4396828529 forward; W4404688861 forward |
| GAP-SCALE-01 | Macro litter CV ≠ MP monitoring | River litter programmes | W4291123479 vs W4391755619 |
| GAP-METHOD-01 | No shared aquatic MP CV benchmark | Cannot compare pilots across countries | Contrast: W4205835860 MARIDA (debris benchmark) |

---

## 8. Screening and harvest bias affecting LATAM evidence

- **English OpenAlex queries** may under-sample Spanish/Portuguese-primary literature (protocol limitation; see slr.md §5.1).  
- **Open-access FT policy** excludes paywalled LATAM venue papers (e.g. W2936115560 fish HSI cited in reviews).  
- **25/228** LATAM flags ≈ **11%** of the corpus, the same order as the `priority_latam` harvest statistic. That share is not negligible, but **field-verified aquatic ML** is ≈ **0%** of obtained FT.

---

## 9. Suggested research and policy directions (evidence-aligned)

1. **Colombia / Andean pilot** — UV or microfluidic screening (W4391755619, W4383534319 methods) with µ-Raman/μFTIR confirmation (W4366815281, W4200249418) on **local river/coast matrices**; publish precision **and** mAP/recall (GAP-GEO-01).  
2. **Retrieve and verify W4408220111** — strongest existing LATAM **field** RS+ML signal (GAP-GEO-03; discovery CORPUS backlog).  
3. **Open benchmark** — aquatic MP in-water images with LATAM sites, shared labels, COCO-style metrics (GAP-METHOD-01).  
4. **Spanish/Portuguese supplementary search** — SciELO, Redalyc, institutional repositories (not in pilot harvest).  
5. **Separate KPIs** — macro litter drone/satellite (W3091414454, W4380082849) vs polymer ID (W4200249418)—do not merge in regulation (GAP-SCALE-01).

---

## 10. Answers to open questions (partial)

| Question | Status after 9.2 |
| --- | --- |
| **Q9** Colombia architecture without local validation | **Open** — tiered transfer only; see §4.3 |
| **Q10** Tiered edge + spectroscopy | **Supported as synthesis proposal**, not validated stack (W4391319604 + W4200249418 pattern) |

---

## 11. Source files

- LATAM scan: [`latam-papers.jsonl`](../corpus/structured/latam-papers.jsonl), [`latam-scan-9.1-log.md`](../corpus/structured/latam-scan-9.1-log.md)  
- Gaps: [`gap-list.md`](../knowledge/gap-list.md) §1, §7  
- Claims: [`claims.jsonl`](../knowledge/claims.jsonl) (Colombia/Brazil batches)  
- Manuscript: [`slr.md`](./slr.md) §4.1.6, §5.6  

**Next (spine):** §9.3–9.4 integrate Magdalena/Caribbean and low-cost sections into `slr.md`; §9.5–9.7 transferable methods lists.


# Part III — Colombia transferable methods


# Transferable methods for Colombia aquatic MP monitoring (actionable)

**Phase:** 9.6 | **Date:** 2026-05-18  
**Audience:** Programme designers (Magdalena basin, Caribbean coast, municipal utilities)  
**Evidence base:** Obtained full texts + structured extraction in [systematic map](./slr.md); **no Colombia field validation** in corpus ([latam-gap-analysis.md](./latam-gap-analysis.md) §4).

**Use rule:** Treat every row below as a **transfer hypothesis** until replicated on Colombian matrices with dual metrics (detection rate + polymer confirmation).

---

## Recommended tiered architecture (Colombia)

```text
Tier 0 — Prevention / source (policy)     → textile washing, industrial discharge controls
Tier 1 — Spatial litter surveillance      → drone / satellite macro debris (separate KPI)
Tier 2 — Field screening (alert)          → UV or RGB detectors; report mAP + precision
Tier 3 — Laboratory confirmation (hub)    → µ-Raman / μFTIR on subsets
Tier 4 — Research pilots only             → microfluidics, holographic flow, AI flume cameras
```

Do **not** procure a single “AI microplastic sensor” claiming Colombia-validated performance—none exists in open evidence (Section 6, [slr.md](./slr.md)).
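
The hand-off from Tier 2 (field alert) to Tier 3 (laboratory confirmation) in the diagram above could be expressed as a simple rule on detector output. This is a sketch only: the function name, the score threshold, and the minimum detection count are hypothetical values that a pilot would have to calibrate on local data, and none of them come from the corpus.

```python
from typing import Sequence


def needs_lab_confirmation(scores: Sequence[float],
                           score_threshold: float = 0.5,
                           min_detections: int = 3) -> bool:
    """Flag a field sample for Tier 3 when enough confident detections exist.

    `scores` are per-detection confidence values from a Tier 2 detector.
    Both thresholds are placeholders, not validated settings.
    """
    confident = [s for s in scores if s >= score_threshold]
    return len(confident) >= min_detections


print(needs_lab_confirmation([0.9, 0.8, 0.55, 0.2]))  # True
print(needs_lab_confirmation([0.9, 0.3]))             # False
```

Keeping the rule this explicit makes it easy to report alongside pilot results, so that the alert is never mistaken for a polymer identification.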

---

## Actionable method cards (transferable with conditions)

### T1 — Shared spectroscopy hub (µ-Raman + μFTIR)

| Field | Detail |
| --- | --- |
| **Evidence** | W4200249418 (μFTIR RDF accuracy 0.9766, κ 0.9690); W4366815281 (precision ≥97.1%, recall ≥99.4%, five polymers); W4382931577 (Raman 98.8% lab) |
| **Fit for Colombia** | National or Andean **regional lab** serving Magdalena cities, Caribbean ports, utilities |
| **Pilot actions** | (1) Select 3 matrices: Magdalena surface water, estuary sediment, WWTP effluent. (2) Harmonise prep with W4200249418 dual-control QA. (3) Report **hit rate** (% samples with polymer ID), not accuracy alone. |
| **Budget** | High capex (GAP-EDGE-02); partner university or INOCAR-style institute |
| **Transfer risk** | **Medium** — chemistry/turbidity unproven; workflow transferable |
| **Priority** | **P1** if polymer ID is legally required |
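
The hit rate named in the pilot actions is a per-sample measure, distinct from classifier accuracy on spectra. A minimal sketch, assuming a hypothetical record layout in which each sample carries a list of confirmed polymer IDs (the sample IDs and polymers below are toy values):

```python
def hit_rate(samples: list) -> float:
    """Share of samples with at least one confirmed polymer ID."""
    if not samples:
        raise ValueError("no samples supplied")
    hits = sum(1 for s in samples if s["polymer_ids"])
    return hits / len(samples)


# Toy samples; real entries would come from the hub's confirmation log.
samples = [
    {"sample_id": "s1", "polymer_ids": ["PE", "PP"]},
    {"sample_id": "s2", "polymer_ids": []},
    {"sample_id": "s3", "polymer_ids": ["PET"]},
    {"sample_id": "s4", "polymer_ids": []},
]
print(f"hit rate: {hit_rate(samples):.0%}")  # hit rate: 50%
```

A classifier can score well on the spectra it receives while half the samples yield no usable identification, which is why the two figures should be reported side by side.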

### T2 — UV imagery + Faster R-CNN polymer screening (field alert)

| Field | Detail |
| --- | --- |
| **Evidence** | W4391755619 (Thailand coastal UV; precision 85.5–87.8%; **mAP 33.9–35.7%**) |
| **Fit for Colombia** | **Caribbean** coastal audits, fixed UV rigs on piers or small boats—not sole compliance metric |
| **Pilot actions** | (1) Replicate UV capture protocol on 500+ labelled boxes in local water. (2) Publish **mAP and precision** separately. (3) Define alert threshold for “send to lab” triggers. |
| **Budget** | Medium (UV hardware + GPU); lower than full Raman hub |
| **Transfer risk** | **Medium–high** — biofouling, turbidity, different polymer mix |
| **Priority** | **P2** for coastal screening; **not** Magdalena turbid river without adaptation |
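
Precision and mAP answer different questions, which is how the T2 evidence can pair high precision with low mAP. The sketch below uses toy detections (not data from W4391755619) and a simplified, uninterpolated average precision for a single class. COCO-style mAP additionally averages over classes and IoU thresholds, so treat this as an illustration of the mechanism rather than a reproduction of any reported figure.

```python
def precision_recall_ap(is_true_positive: list, n_ground_truth: int):
    """Compute precision, recall, and uninterpolated AP for one class.

    `is_true_positive` lists detections sorted by descending confidence;
    each entry says whether that detection matched a ground-truth particle.
    """
    tp = 0
    ap_sum = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            tp += 1
            ap_sum += tp / rank  # precision at each new recall level
    precision = tp / len(is_true_positive) if is_true_positive else 0.0
    recall = tp / n_ground_truth
    ap = ap_sum / n_ground_truth  # missed particles contribute zero
    return precision, recall, ap


# Toy case: 20 real particles, 8 detections, 7 of them correct.
flags = [True] * 7 + [False]
precision, recall, ap = precision_recall_ap(flags, n_ground_truth=20)
print(f"precision={precision:.3f} recall={recall:.3f} AP={ap:.3f}")
# precision=0.875 recall=0.350 AP=0.350
```

In this toy case almost every detection is right, yet 13 of 20 particles are never found, so AP stays low. A precision figure alone would hide that shortfall.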

### T3 — Riverine macro litter YOLO (litter management KPI)

| Field | Detail |
| --- | --- |
| **Evidence** | W4291123479 (India urban river; mAP 89%, macro solid waste) |
| **Fit for Colombia** | Magdalena **floating waste** management, flood debris—**not** MP polymer compliance |
| **Pilot actions** | (1) Label macro waste classes on Magdalena drone/bank video. (2) Do **not** report as “MP detection.” (3) Pair with Tier 1 policy for flood seasons. |
| **Budget** | Low–medium (RGB cameras, YOLO training) |
| **Transfer risk** | **Low** for litter task; **high** if mislabelled as MP monitoring |
| **Priority** | **P2** for integrated river management; separate budget line from MP ID |

### T4 — Drone orthomosaic litter mapping (coastal / beach)

| Field | Detail |
| --- | --- |
| **Evidence** | W3091414454 (Cambodia; accuracy up to 83% macro litter) |
| **Fit for Colombia** | Caribbean **beach/nearshore** litter surveys; complements W4408220111 (Brazil, unverified) |
| **Pilot actions** | (1) Standardise flight height and tide state. (2) Ground-truth 200+ patches. (3) Report scenario label (plastic vs all litter). |
| **Budget** | Medium (drone + labelling labour) |
| **Transfer risk** | **Medium** — coast type differs; method transferable |
| **Priority** | **P2** Caribbean municipalities |

### T5 — Sentinel-2 floating debris (regional surveillance)

| Field | Detail |
| --- | --- |
| **Evidence** | W4380082849 (scenario accuracy 98% / 83% / 75%); W4205835860 (MARIDA debris IoU) |
| **Fit for Colombia** | **Large-scale** litter/debris patches on Magdalena mouth / Caribbean—**not** sub-mm MPs |
| **Pilot actions** | (1) Use scenario definitions from W4380082849 explicitly. (2) Validate with boat surveys quarterly. (3) Avoid MP regulatory language. |
| **Budget** | Low marginal (archive imagery + compute) |
| **Transfer risk** | **Low** for macro debris mapping |
| **Priority** | **P3** national-scale screening |

### T6 — WWTP / sludge spectroscopy (utility partnership)

| Field | Detail |
| --- | --- |
| **Evidence** | W4200249418 (influent/effluent/sludge matrices); W4366815281 (catchment/WWTP); W4296114416 (removal heterogeneity review) |
| **Fit for Colombia** | Major cities in the Magdalena basin (Bogotá region utilities) and coastal WWTPs |
| **Pilot actions** | (1) Monthly sludge/effluent subsample to hub. (2) Track polymer classes vs removal plant type. (3) Do not use removal % as discharge MP proxy alone. |
| **Budget** | Hub + sampling labour |
| **Transfer risk** | **Medium** — plant-specific |
| **Priority** | **P2** if wastewater MP tracking is mandated |
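
The caution against using removal % alone can be made concrete with a short calculation: a plant can remove most particles and still discharge a large daily load, because load scales with effluent concentration and flow. The concentrations and flow below are hypothetical round numbers chosen for illustration, not values from the corpus.

```python
def removal_fraction(c_in: float, c_out: float) -> float:
    """Fraction of particles removed between influent and effluent."""
    return (c_in - c_out) / c_in


def daily_load(c_out_per_litre: float, flow_m3_per_day: float) -> float:
    """Particles discharged per day (1 m^3 = 1000 L)."""
    return c_out_per_litre * 1000.0 * flow_m3_per_day


# Hypothetical plant: 100 particles/L in, 1 particle/L out, 50,000 m^3/day.
removal = removal_fraction(100.0, 1.0)
load = daily_load(1.0, 50_000.0)
print(f"removal={removal:.0%}, load={load:.1e} particles/day")
# removal=99%, load=5.0e+07 particles/day
```

Reporting effluent concentration and flow together with removal % keeps the discharge estimate from being inferred from a single ratio.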

### T7 — Po River–style HSI concentration mapping (research grade)

| Field | Detail |
| --- | --- |
| **Evidence** | W4213300830 (1.89–8.22 particles/m³; HSI + HI-PLS-DA) |
| **Fit for Colombia** | **Research expedition** on Magdalena transects—not routine municipal |
| **Pilot actions** | (1) One wet-season campaign with shared protocol. (2) Publish concentration maps with instrument cost disclosure. |
| **Budget** | **Very high** (SWIR HSI) |
| **Transfer risk** | **High** — European river transfer |
| **Priority** | **P3** research grants only |

### T8 — Microfluidic lab screening (small-n field demo)

| Field | Detail |
| --- | --- |
| **Evidence** | W4383534319 (>93% lab; **n=5** field particles) |
| **Fit for Colombia** | Mobile lab add-on after method development |
| **Pilot actions** | (1) Replicate lab accuracy on Magdalena samples. (2) Target **≥100** field particles before operational claims. |
| **Budget** | Medium (microfluidic fabrication skill) |
| **Transfer risk** | **High** until field n scaled |
| **Priority** | **P3** R&D |
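
The ≥100-particle target can be motivated by how wide a confidence interval is at small n. The sketch below computes a Wilson score interval for a proportion. The success counts are hypothetical (the source count of correct field identifications is not given here), chosen to show the best case where every particle is classified correctly.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical best case: all particles correct, at n=5 and n=100.
for n in (5, 100):
    low, high = wilson_interval(n, n)
    print(f"n={n}: {low:.2f} to {high:.2f}")
# n=5 gives a lower bound near 0.57; n=100 gives one near 0.96
```

Even a perfect score on five particles is statistically compatible with a true accuracy below 60%, whereas a perfect score on 100 particles narrows the lower bound to roughly 96%.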

### T9 — High-throughput polarization holographic flow (lab centre)

| Field | Detail |
| --- | --- |
| **Evidence** | W4391319604 (accuracy up to 96%; flow-through lab) |
| **Fit for Colombia** | Regional **screening centre** after capital investment |
| **Pilot actions** | (1) Costing workshop with operators. (2) Compare throughput vs SMACC microscopy (W3003736709). |
| **Budget** | High specialised optics |
| **Transfer risk** | **Medium** — lab-only in source |
| **Priority** | **P3** |

### T10 — Retrieve Brazil beach RS+ML protocol (LATAM field precedent)

| Field | Detail |
| --- | --- |
| **Evidence** | W4408220111 (São Paulo coast; RF/GB + GNSS + µ-Raman; forward FT) |
| **Fit for Colombia** | Caribbean **sandy beach** deposition models—closest LATAM **field** ML signal |
| **Pilot actions** | (1) Obtain full text (CORPUS backlog). (2) Compare beach type to Cartagena/Santa Marta. (3) Adapt sampling before ML procurement. |
| **Budget** | Low (literature) + pilot replication cost TBD |
| **Transfer risk** | **High** until verified |
| **Priority** | **P1** literature step before Caribbean ML procurement |

---

## Suggested 18-month pilot sequence (Magdalena + Caribbean)

| Phase | Months | Activity | Methods | Success metric |
| --- | --- | --- | --- | --- |
| **A — Design** | 1–3 | Stakeholder KPIs; separate litter vs MP | T3, T5 | Signed metric dictionary |
| **B — Hub setup** | 3–9 | Establish Raman/FTIR hub; SOP from W4200249418 | T1, T6 | QA pass on reference materials |
| **C — Coastal screen** | 6–12 | UV pilot (Caribbean) + drone litter | T2, T4 | mAP + precision on local data |
| **D — River litter** | 6–12 | YOLO macro waste Magdalena | T3 | mAP on floating waste class |
| **E — Evaluation** | 12–18 | Publish Colombia dataset; policy brief | All | Open labels + polymer confirmation subset |

---

## Minimum reporting checklist (any Colombia pilot)

- [ ] Matrix (river, estuary, beach, WWTP) and season  
- [ ] MP size class or explicit macro-litter proxy  
- [ ] **Detection metric** (mAP, recall, or count error) **and** **confirmation metric** (spectral accuracy on subset)  
- [ ] Sample **n** and polymer prevalence  
- [ ] Instrument capex and $/sample  
- [ ] Comparison to non-Colombia source `paper_id` (transfer transparency)
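
The checklist can be enforced mechanically before a pilot report is accepted. The following is a minimal sketch, assuming each report arrives as a plain dictionary; the key names (`matrix`, `size_class`, `instrument_capex`, and so on), the function `missing_items`, and the example values are hypothetical and not part of any existing project schema.

```python
# Hypothetical field names mapped to the checklist wording above.
REQUIRED_FIELDS = {
    "matrix": "Matrix (river, estuary, beach, WWTP)",
    "season": "Season",
    "size_class": "MP size class or explicit macro-litter proxy",
    "detection_metric": "Detection metric (mAP, recall, or count error)",
    "confirmation_metric": "Confirmation metric (spectral accuracy on subset)",
    "sample_n": "Sample n",
    "polymer_prevalence": "Polymer prevalence",
    "instrument_capex": "Instrument capex",
    "cost_per_sample": "$/sample",
    "source_paper_id": "Non-Colombia source paper_id",
}


def missing_items(report: dict) -> list[str]:
    """Return checklist labels whose field is absent or empty in the report."""
    empty_values = (None, "", [], {})
    return [
        label
        for key, label in REQUIRED_FIELDS.items()
        if report.get(key) in empty_values
    ]


# Illustrative draft report; every value here is made up for the example.
draft_report = {
    "matrix": "estuary",
    "season": "wet",
    "size_class": "1-5 mm",
    "detection_metric": "mAP",
    "sample_n": 120,
    "polymer_prevalence": {"PE": 0.5, "PP": 0.3, "other": 0.2},
    "cost_per_sample": 12.0,
    "source_paper_id": "W4391755619",
}

gaps = missing_items(draft_report)
for label in gaps:
    print("MISSING:", label)
```

Here the draft would be returned to its authors because it lacks a confirmation metric and an instrument capex figure, the two items most often dropped when a detection headline is reported alone.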

---

## Related files

- [latam-gap-analysis.md](./latam-gap-analysis.md)  
- [gs-oecd-comparison-9.5.md](./gs-oecd-comparison-9.5.md)  
- [slr.md](./slr.md) §4.1.8, §6, §7  
- [questions.md](../questions.md) Q9 (partially addressed by pilot sequence above)

**Next (spine):** 9.7 — non-transferable methods and why.


# Part IV — Colombia non-transferable / risky transfers


# Non-transferable methods for Colombia aquatic MP monitoring (and why)

**Phase:** 9.7 | **Date:** 2026-05-18  
**Companion:** [colombia-transferable-methods.md](./colombia-transferable-methods.md) (9.6)  
**Evidence:** Obtained full texts, extraction rows, and gap IDs in [gap-list.md](../knowledge/gap-list.md); [slr.md](./slr.md) §4.1, §6–7.

**Use rule:** If a method appears below, do **not** deploy it in Colombia as **validated MP monitoring** without a local pilot that fixes the stated failure mode. Some rows support **adjacent** programmes (litter, R&D) when KPIs are relabelled—see “Allowed if relabelled.”

---

## Decision matrix (quick)

| Failure mode | Gap ID | Colombia risk |
| --- | --- | --- |
| Wrong KPI (litter/debris as MP) | GAP-SCALE-01 | False regulatory compliance |
| Wrong matrix (consumer, flume, indoor) | GAP-EDGE-01, GAP-FIELD-01 | Wrong instrument procurement |
| Unverified abstract metrics | GAP-METHOD-03, GAP-CORPUS-01 | Budget on hype |
| Geography / affiliation mismatch | GAP-GEO-01, GAP-GEO-02 | Policy cites non-Colombia evidence |
| Lab accuracy sold as field ops | GAP-FIELD-02, GAP-ENV-01 | Underestimated OPEX and error |

---

## Non-transferable method cards (NT1–NT12)

### NT1 — MARIDA / marine-debris segmentation IoU as “MP detector”

| Field | Detail |
| --- | --- |
| **Evidence** | W4205835860 (MARIDA; marine debris masks; IoU-oriented CV) |
| **Why not transferable** | Task is **floating macro debris**, not polymer-identified microplastics in water. IoU on debris patches ≠ MP concentration or polymer compliance. |
| **Colombia misuse** | Procurement of “satellite MP AI” for Magdalena/Caribbean using MARIDA weights without relabelling KPI. |
| **Allowed if relabelled** | **Tier 1 litter surveillance** only (see T5 in transferable doc). |
| **Gap** | GAP-SCALE-01, GAP-METHOD-01 |

### NT2 — River macro-litter YOLO mAP as MP monitoring metric

| Field | Detail |
| --- | --- |
| **Evidence** | W4291123479 (India urban river; **89% mAP** solid waste, not MPs) |
| **Why not transferable** | Object classes are **bottles, bags, macro waste**—not sub-mm MP particles or polymer ID. High mAP does not bound MP false negatives. |
| **Colombia misuse** | Magdalena flood programme reports “89% MP detection” from litter CV. |
| **Allowed if relabelled** | **Floating waste management** KPI separate from MP regulation (T3). |
| **Gap** | GAP-SCALE-01 |

### NT3 — Drone / Sentinel-2 litter accuracy as aquatic MP compliance

| Field | Detail |
| --- | --- |
| **Evidence** | W3091414454 (Cambodia drone, up to **83%** macro litter); W4380082849 (Sentinel-2 scenarios **98% / 83% / 75%**) |
| **Why not transferable** | Resolution and labels target **visible litter/debris scenarios**, not in-water MP counts or polymers. |
| **Colombia misuse** | Caribbean coastal MP limits inferred from debris-classifier accuracy alone. |
| **Allowed if relabelled** | Regional **debris patch** mapping with boat validation (T4, T5). |
| **Gap** | GAP-SCALE-01 |

### NT4 — Phone-microscope YOLO ($10 scope) for river or coastal MP compliance

| Field | Detail |
| --- | --- |
| **Evidence** | W4409162823 (consumer-product on-site extraction; **no obtained FT**; matrix not aquatic field) |
| **Why not transferable** | Matrix is **consumer products** after manual extraction—not continuous river/coastal sensing. Metrics **unverified** in corpus. |
| **Colombia misuse** | Municipal “low-cost MP network” built on phone kits without aquatic validation. |
| **Allowed if relabelled** | Consumer-product screening R&D only, after FT retrieval and metric audit. |
| **Gap** | GAP-EDGE-01, GAP-METHOD-03 |

### NT5 — Real-time AI camera (lab flume) as in-pipe or in-river sensor

| Field | Detail |
| --- | --- |
| **Evidence** | W4400418758 (YOLOv5 + DeepSORT; **laboratory flume**; forward, no OA PDF) |
| **Why not transferable** | Controlled **flume motion** ≠ turbid Magdalena or Caribbean field optics; performance **unverified**. |
| **Colombia misuse** | WWTP or river intake installs marketed as “real-time MP AI camera” from abstract claims. |
| **Allowed if relabelled** | University flume replication study before any utility pilot. |
| **Gap** | GAP-FIELD-03, GAP-EDGE-01 |

### NT6 — UV Faster R-CNN precision alone (without mAP) for regulatory sign-off

| Field | Detail |
| --- | --- |
| **Evidence** | W4391755619 (Thailand UV; precision **85.5–87.8%** but **mAP 33.9–35.7%**) |
| **Why not transferable** | **Precision–mAP split** shows small-object/localisation weakness; Thailand coastal matrix ≠ Magdalena turbidity or Caribbean fouling without replication. |
| **Colombia misuse** | Compliance based on precision headline while missing particles drive false negatives. |
| **Allowed if relabelled** | **Alert-tier** screening after local mAP study (T2)—not sole legal metric. |
| **Gap** | GAP-METHOD-02, GAP-FIELD-01 |

### NT7 — Microfluidic CNN “>93%” as operational field monitor

| Field | Detail |
| --- | --- |
| **Evidence** | W4383534319 (lab **>93%**; field demo **n = 5** particles) |
| **Why not transferable** | Field evidence is **proof-of-concept scale**, not operational monitoring statistics. |
| **Colombia misuse** | Nationwide rollout from lab accuracy without ≥100-particle field campaign. |
| **Allowed if relabelled** | R&D pilot with scaled field n (T8). |
| **Gap** | GAP-FIELD-02 |

### NT8 — Review-cited “>97% μFTIR ML” aggregates as Colombia primary benchmark

| Field | Detail |
| --- | --- |
| **Evidence** | W4393943493 (aged-MP review; **cited literature aggregates**, not primary experiments) |
| **Why not transferable** | Numbers are **secondary citations** across heterogeneous studies; aged/tropical weathering breaks library-match assumptions cited in review. |
| **Colombia misuse** | Tender specs cite review aggregates as if measured on Colombian matrices. |
| **Allowed if relabelled** | Hypothesis for **hub design**; must replicate on local sludge/water (T1, T6) with primary reporting. |
| **Gap** | GAP-ENV-01, GAP-METHOD-03 |

### NT9 — HSI shape taxonomy (11,042 particles) as turnkey aquatic field ML

| Field | Detail |
| --- | --- |
| **Evidence** | W4385737119, W4414305742 (forwards; shape harmonisation; mixed matrices incl. wastewater fractions) |
| **Why not transferable** | No obtained-FT **aquatic field deployment** metrics; indoor/air and lab imaging dominate evidence; **no open aquatic MP CV benchmark** linked (GAP-METHOD-01). |
| **Colombia misuse** | “Replace expert labelling nationwide” without field CV validation on Magdalena/Caribbean water. |
| **Allowed if relabelled** | Lab comparability and training-data standardisation inside a hub (T1)—not stand-alone field sensor. |
| **Gap** | GAP-SCALE-03 |

### NT10 — Brazil beach RS+ML accuracy (abstract-only) for Caribbean procurement

| Field | Detail |
| --- | --- |
| **Evidence** | W4408220111 (São Paulo sandy beach; RF/GB + GNSS + µ-Raman; **forward FT**; extraction **unverified**) |
| **Why not transferable** | Strongest LATAM **field** signal is **not verified** in open corpus; sandy urban beach ≠ Andean river or turbid estuary without study. |
| **Colombia misuse** | Cartagena/Santa Marta tenders cite abstract accuracy and deposition rates (6–35 particles/m²) as validated. |
| **Allowed if relabelled** | Literature review step only until CORPUS retrieval + local beach pilot (T10 transferable). |
| **Gap** | GAP-GEO-03 |

### NT11 — Colombia affiliation paper (W4392657594) as national field programme evidence

| Field | Detail |
| --- | --- |
| **Evidence** | W4392657594 (Colombian affiliation; **no obtained FT**; not aquatic MP field CV in map) |
| **Why not transferable** | **Authorship ≠ in-country field validation** (GAP-GEO-02). No Magdalena/Caribbean campaign in corpus. |
| **Colombia misuse** | Policy cites “Colombian ML MP study” from affiliation metadata alone. |
| **Allowed if relabelled** | None for monitoring claims—use only after dedicated Colombia pilot publishes field metrics. |
| **Gap** | GAP-GEO-01, GAP-GEO-02 |

### NT12 — `global_south=yes` extraction flag as LATAM/Colombia deployment proof

| Field | Detail |
| --- | --- |
| **Evidence** | **12** rows `global_south=yes`; **4** GS field-vision primaries (IN/TH/KH/CN); **4** LATAM∩GS overlap in [gs-oecd-comparison-9.5.md](./gs-oecd-comparison-9.5.md) |
| **Why not transferable** | Flag captures **affiliation or author country**, not field geography or Colombia matrices. India/Thailand methods do not imply Colombian error rates. |
| **Colombia misuse** | Bibliometric dashboards rank “GS-ready” papers that never sampled Colombian water. |
| **Allowed if relabelled** | Use `latam-papers.jsonl` + **field site** columns in extraction for procurement filters. |
| **Gap** | GAP-GEO-02 |

---

## Additional “do not procure as-is” patterns

| Pattern | Example `paper_id` | Why |
| --- | --- | --- |
| **Po River HSI routine municipal** | W4213300830 | European river; **very high** SWIR capex—not LATAM municipal default (T7 transferable = research only). |
| **Holographic flow “low-cost edge” without OPEX model** | W4391319604 | Verified lab primary, but it relies on **specialised optics** and does not support a phone-edge narrative (GAP-EDGE-02). |
| **In-situ sensor review narrative → product** | W4396828529, W4404688861 (forwards) | Reviews describe **need** for in-situ tools; corpus lacks open FT product validation (GAP-FIELD-01). |
| **Paywalled LATAM venue excluded by OA policy** | e.g. W2936115560 (cited in reviews) | Evidence gap, not proof methods fail—**do not** fill gap with unverified OECD papers instead. |
| **Excluded scope (agriculture, indoor air only)** | W4404968855, indoor-only fractions of HSI forwards | Outside aquatic MP monitoring mandate per [protocol.md](../../protocol.md). |

---

## Procurement red flags (Colombia)

1. Vendor cites **accuracy** without matrix (river, estuary, beach, WWTP) and **size class**.  
2. Single metric (accuracy or precision) **without** mAP/recall or spectral confirmation rate.  
3. **“AI microplastic sensor”** with no open full text and no Colombian pilot data.  
4. **Litter/debris** CV marketed as MP polymer monitoring (NT1–NT3).  
5. **Review aggregates** (NT8) or **affiliation** (NT11) substituted for field trials.  
6. **Global South** label without site coordinates and season (NT12).

---

## Mapping to transferable methods (9.6)

| Non-transferable | Use instead (if any) |
| --- | --- |
| NT1–NT3 litter/debris KPI | T3, T4, T5 with **separate** litter KPI |
| NT4–NT5 edge hype | T2 field UV + T1 lab hub; verify FT first |
| NT6 precision-only UV | T2 with dual metrics |
| NT7 microfluidic lab % | T8 scaled pilot |
| NT8 review >97% | T1 primary replication |
| NT9 HSI taxonomy alone | T1 hub + future benchmark |
| NT10 Brazil abstract | T10 retrieve + pilot |
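| NT11 affiliation CO | New Colombia pilot (latam-gap §9.1) |
| NT12 `global_south` flag | `latam-papers.jsonl` + **field site** extraction columns as procurement filters |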

---

## Related files

- [colombia-transferable-methods.md](./colombia-transferable-methods.md)  
- [latam-gap-analysis.md](./latam-gap-analysis.md)  
- [slr.md](./slr.md) §4.1.2–4.1.8, §6, §7  
- [questions.md](../../questions.md) Q9, Q10  

**Next (spine):** 9.8 — LATAM-specific claims in `claims.jsonl`.


# Part V — Research gaps


# Research gaps — microplastics ML/CV aquatic detection (2019–2025)

**Phase:** 7.8 | **Date:** 2026-05-18  
**Corpus:** 228 identified · 116 forward · 31 included FT · 28 primary extraction rows · 61 claims  
**Emphasis:** Deployment in resource-limited and Global South settings (incl. Colombia relevance in Phase 9).

Evidence tags: **P** = primary (`is_review: false`), **R** = review/synthesis, **F** = forward abstract-only / FT not obtained.

---

## Priority legend

| Priority | Meaning |
| --- | --- |
| **P1** | Blocks credible field monitoring in LATAM / low-resource contexts |
| **P2** | Limits synthesis quality or transferability |
| **P3** | Important but partially addressed in corpus |

---

## 1. Geographic and institutional gaps

### GAP-GEO-01 — No Colombia (or broader LATAM) field-validated aquatic ML/CV pipeline

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | No obtained-full-text primary reports an **in-country field program** for ML/CV microplastic detection in Colombian waters. |
| **Evidence** | W4392657594 (CO affiliation, abstract-only); claims batch C; Phase 6.10 audit |
| **Implication** | Policy and monitoring pilots cannot cite local validation; transfer must assume extrapolation from India/Thailand/Cambodia/China/Brazil (abstract). |

### GAP-GEO-02 — Global South flags ≠ field deployment evidence

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | **12** extraction rows `global_south=yes` but only **four** primaries combine GS **field geography + trained vision** (India river YOLO, Thailand UV Faster R-CNN, Cambodia drone, China underwater Mask R-CNN). |
| **Evidence** | W4291123479, W4391755619, W3091414454, W3204790372; manifest `extraction_global_south_yes=12` |
| **Implication** | Authorship or affiliation-based tagging overstates operational Global South monitoring capacity. |

### GAP-GEO-03 — Brazil beach RS+ML not yet verified in corpus

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | São Paulo urban beach MP deposition study (RF/GB + remote sensing) is **forward abstract-only** — strongest LATAM **field** signal unverified. |
| **Evidence** | W4408220111 (F) |

---

## 2. In-situ and field aquatic monitoring

### GAP-FIELD-01 — In-situ aquatic MP detection remains review-heavy, prototype-light

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | Systematic reviews stress need for **in-situ** methods (standardisation, cost, coverage); corpus primaries are predominantly **lab extraction + spectroscopy** or **macro litter CV**. |
| **Evidence** | W4396828529 (R,F); W4404688861 (F); W3172017684 prototype (P); contradictions §4 |
| **Implication** | Deployment gap between “lab-validated ID” and “operational river/coastal sensor.” |

### GAP-FIELD-02 — Weak field replication for promising lab CV

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | Microfluidic seawater CNN **>93%** accuracy but field demo **n = 5** particles. |
| **Evidence** | W4383534319 (P) |

### GAP-FIELD-03 — Real-time edge AI camera unverified at full text

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | YOLOv5 + DeepSORT “real-time MP” AI camera (lab flume) lacks obtained FT and verified metrics. |
| **Evidence** | W4400418758 (F); edge_low_cost=yes |

---

## 3. Low-cost and edge deployment

### GAP-EDGE-01 — Edge/low-cost claims rarely backed by obtainable full text

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | **4** extraction rows `edge_low_cost=yes`; only **W4391319604** has OA full text — others abstract-only or embeddable macro-litter (W3204790372, W4391755619, W4400418758). |
| **Evidence** | W4391319604, W4391755619, W3204790372, W4400418758; claims batch C |
| **Implication** | “Affordable monitoring” narrative exceeds verified open evidence in this corpus. |

### GAP-EDGE-02 — Capital equipment dominates validated MP ID

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | Reviews and primaries assume **µ-Raman, μFTIR, HSI, O-PTIR, deep UV Raman** — high capex for municipal/LATAM labs. |
| **Evidence** | W4296114416 (R), W4318615471 (R), W4366815281, W4200249418, W4308496878 (P); contradictions §5 |
| **Implication** | Phone-microscope YOLO (W4409162823) and similar low-cost kits are so far described only for **consumer/lab** matrices, not aquatic field stacks. |

---

## 4. Scale and task-definition gaps

### GAP-SCALE-01 — Macro litter CV dominates vision literature; MP-specific field CV is sparse

| | |
| --- | --- |
| **Priority** | P1 |
| **Gap** | Five **rgb_object_detection** modality groups vs few MP-sized field detectors; satellite/AUV papers target **debris**, not polymer-level MPs in water. |
| **Evidence** | W4205835860, W4380082849, W4321194910, W4291123479, W3091414454 (P); modality-map |
| **Implication** | Transferring litter-mAP to MP monitoring policy is unsafe (metrics-legend §7). |

### GAP-SCALE-02 — Sub-mm MP weak in HSI waste/sorting literature

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | HSI plastic-waste review reports precision **below 80%** for particles in the sub-mm range in the studies it cites. |
| **Evidence** | W4318615471 (R) |

### GAP-SCALE-03 — Harmonised HSI shape taxonomy not yet paired with open aquatic field benchmarks

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Shape classes validated on 11,042 particles (W4385737119) and DL shape classifiers (W4414305742) lack linked **aquatic field ML monitoring** datasets in corpus. |
| **Evidence** | W4385737119, W4414305742 (F) |

---

## 5. Methods, data, and comparability

### GAP-METHOD-01 — No cross-study benchmark for aquatic MP CV

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Unlike marine debris (**MARIDA**, W4205835860), no shared **microplastic-in-water** CV benchmark dataset spans labs and LATAM sites. |
| **Evidence** | W4205835860 vs MP primaries; W4385411640 metrology primer |

### GAP-METHOD-02 — Metric incomparability across papers

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Corpus mixes spectral **accuracy**, detector **mAP**, segmentation **mIoU**, and **count error** without unified reporting. |
| **Evidence** | metrics-legend.md; W4391755619 precision vs mAP; contradictions §7 |

### GAP-METHOD-03 — `unverified` and abstract-only metrics still common

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | **8** abstract-only extraction rows; Phase 6.12 audit tagged multiple `unverified` fields. |
| **Evidence** | extraction.csv; extraction-unverified-6.12-log.md |

### GAP-METHOD-04 — Automation narrative vs expert-in-the-loop practice

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Reviews cite automation bottlenecks; μFTIR RDF pipeline retains **expert dual-control** and commercial software. |
| **Evidence** | W4313826580 (R), W4200249418 (P); contradictions §9 |

---

## 6. Environmental realism and matrix coverage

### GAP-ENV-01 — Weathered / aged particles underrepresented in training evidence

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Aged-MP review: weathering breaks library-match assumptions; few primaries train on **environmentally aged** aquatic field samples. |
| **Evidence** | W4393943493 (R); W4362015000 weathered SloPP-E lab only (P) |

### GAP-ENV-02 — Corona / matrix effects on spectroscopy under-captured in ML pipelines

| | |
| --- | --- |
| **Priority** | P3 |
| **Gap** | Corona review (toxicity lens) not linked to detection-model retraining practices in primaries. |
| **Evidence** | W4290026722 (R) |

### GAP-ENV-03 — WWTP removal efficiency highly heterogeneous in reviews

| | |
| --- | --- |
| **Priority** | P3 |
| **Gap** | Cited WWTP MP removal **1.8–54.5%** depending on treatment — weak single-number policy guidance. |
| **Evidence** | W4296114416 (R) |

---

## 7. Access, corpus, and evidence base

### GAP-CORPUS-01 — Full-text access barrier (85 excluded at FT stage)

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | **85** forwards lacked retrievable OA PDF/HTML under project rules; central 2025 ML-MP review (W4409887007) not obtained. |
| **Evidence** | manifest `excluded_full_text=85`; W4409887007 (F) |

### GAP-CORPUS-02 — High-impact paywalled primaries (e.g. fish HSI+SVM)

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Fish-intestine HSI rapid workflow (W2936115560) paywalled — blocks replication of cited **6-minute** pipeline. |
| **Evidence** | W2936115560; manifest sources |

### GAP-CORPUS-03 — Included FT not yet extracted (O-PTIR)

| | |
| --- | --- |
| **Priority** | P3 |
| **Gap** | W4308496878 included with summary but absent from `extraction.csv` — spectroscopy comparison underused in structured tables. |
| **Evidence** | W4308496878; modality-map supplement |

---

## 8. Open science and operations

### GAP-OPS-01 — Open training data rarely released

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | Most extraction rows `open_data=unverified`; MARIDA exception is **macro debris**, not MPs. |
| **Evidence** | extraction.csv; W4205835860 |

### GAP-OPS-02 — Five-polymer / class-limited models vs environmental diversity

| | |
| --- | --- |
| **Priority** | P2 |
| **Gap** | µ-Raman DL limited to **five polymers**; nanoplastic RF covers 24 types but lab-spiked. |
| **Evidence** | W4366815281 vs W4382931577 (P) |

---

## 9. Gap summary matrix (for Phase 8–9)

| Theme | Gaps (priority) | Key `paper_id` anchors |
| --- | --- | --- |
| LATAM / Colombia field validation | GEO-01, GEO-02 (P1) | W4392657594, W4408220111 |
| In-situ aquatic monitoring | FIELD-01, FIELD-02 (P1) | W4396828529, W4383534319 |
| Low-cost verified deployment | EDGE-01, EDGE-02 (P1) | W4391319604, W4296114416 |
| MP vs macro litter | SCALE-01 (P1) | W4291123479, W4205835860 |
| Benchmarks & metrics | METHOD-01, METHOD-02 (P2) | W4205835860, W4391755619 |
| Evidence access | CORPUS-01, CORPUS-02 (P2) | W4409887007, W2936115560 |

---

## Suggested research directions (not exhaustive)

1. **Open aquatic MP CV benchmark** with LATAM river/coastal sites, MP size class labels, and shared mAP/mIoU protocol (addresses METHOD-01, SCALE-01).  
2. **Tiered monitoring stack** — phone/edge vision for litter hot-spots + portable Raman/FTIR hub for polymer confirmation (EDGE-01, EDGE-02).  
3. **Colombia (or Andean) pilot** pairing UV/flow imaging with spectroscopic ground-truth (GEO-01).  
4. **Field-scale validation** of microfluidic or in-situ Raman prototypes with n > 100 particles and wet-season matrices (FIELD-02, FIELD-01).  
5. **Aged-particle training corpora** for spectroscopy+ML under tropical weathering (ENV-01, W4393943493).

---

## Source files

- [claims.jsonl](./claims.jsonl) · [contradictions.md](./contradictions.md) · [modality-map.md](./modality-map.md)  
- [reviews-synthesis.md](./reviews-synthesis.md) · `corpus/structured/extraction.csv`  
- Open questions backlog: [questions.md](../../questions.md) (Phase 7.9)


# Part VI — Open questions


# Open questions — microplastics ML detection SLR

Track gaps and contradictions during synthesis. One item per question; cite `paper_id` where relevant.  
**Phase 7.9** (2026-05-18): 14 items — **6 resolved**, **8 open**. See [gap-list.md](./outputs/knowledge/gap-list.md) for gap IDs.

---

## Resolved (this corpus pass)

### Q1 — Does the harvested corpus meet the factory paper-count target?

**Status:** Resolved  
**Answer:** Yes. `papers.jsonl` holds **228** unique records (target ≥120). `manifest.json` stats align.  
**Refs:** `manifest.json`, Phase 1–2 spine.

---

### Q2 — Is there an obtained-full-text primary with a Colombia aquatic field site?

**Status:** Resolved (negative)  
**Answer:** No. The only Colombia-tagged extraction row is **W4392657594** (author affiliation; abstract-only, no field site in available text). No other included primary names Colombia as a sampling location.  
**Refs:** W4392657594; GAP-GEO-01; claims batch C.

---

### Q3 — Can riverine YOLO mAP be reported as microplastic detection performance?

**Status:** Resolved (no)  
**Answer:** No. **W4291123479** targets **macro solid waste** on an India urban river (mAP 89%); task and size class differ from MP monitoring.  
**Refs:** W4291123479; metrics-legend §7; GAP-SCALE-01.

---

### Q4 — Why do precision and mAP diverge sharply on the Thailand UV MP detector?

**Status:** Resolved (explain, not error)  
**Answer:** **W4391755619** reports **85.5–87.8% precision** on classified boxes but **mAP 33.9–35.7%** — precision is conditional on detections; mAP penalises missed small objects. Report both.  
**Refs:** W4391755619; contradictions §1; metrics-legend.
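
A toy calculation makes the split concrete. The sketch below computes precision and single-class average precision (all-point interpolation, as in common detection benchmarks) for a detector that is usually right when it fires but finds only a fraction of the particles. The counts (20 ground-truth particles, 8 detections, 7 correct) are invented for illustration and are not taken from W4391755619; the function name `average_precision` is likewise an assumption.

```python
def average_precision(hits: list[bool], n_ground_truth: int) -> float:
    """Area under the precision-recall curve for one class.

    `hits` lists detections in descending confidence order;
    True marks a detection matched to a ground-truth object.
    """
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_ground_truth)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Invented example: 20 particles in the images, 8 boxes predicted, 7 of them correct.
hits = [True] * 7 + [False]
n_ground_truth = 20

precision = sum(hits) / len(hits)
recall = sum(hits) / n_ground_truth
ap = average_precision(hits, n_ground_truth)

print(f"precision={precision:.3f} recall={recall:.3f} AP={ap:.3f}")
```

With these made-up counts precision is 0.875 while AP is 0.35, because the 13 missed particles cap recall at 0.35 and AP cannot exceed the recall actually reached. The same mechanism is consistent with a high-precision, low-mAP report on small objects.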

---

### Q5 — How many extraction rows claim `global_south=yes` vs show Global South field geography in vision primaries?

**Status:** Resolved  
**Answer:** **12** rows `global_south=yes` in extraction; **four** obtained-FT vision primaries combine GS **field site + model** (India W4291123479, Thailand W4391755619, Cambodia W3091414454, China W3204790372).  
**Refs:** manifest `extraction_global_south_yes`; GAP-GEO-02.

---

### Q6 — Do factory targets for claims and extraction rows meet minimums?

**Status:** Resolved  
**Answer:** Yes. **61** claims (≥60); **37** extraction rows (≥25); **31** summaries (≥25); **25** OA PDFs (≥20).  
**Refs:** `claims.jsonl`, `extraction.csv`, `manifest.json` stats.

---

## Open (for Phase 8–9 synthesis and follow-up)

### Q7 — Should W4308496878 be added to `extraction.csv`?

**Status:** Open  
**Context:** Included full text + summary exist; O-PTIR+Raman comparison and 1.4% visual-ID cite are synthesis-relevant but missing from structured extraction.  
**Refs:** W4308496878; GAP-CORPUS-03.

---

### Q8 — Can OA full text be obtained for the 2025 ML+MP systematic review (W4409887007)?

**Status:** Open  
**Context:** Forward review flagged as central taxonomy anchor; paywalled / not retrieved. Would strengthen method landscape section.  
**Refs:** W4409887007; GAP-CORPUS-01; discovery backlog CORPUS items.

---

### Q9 — What monitoring architecture is defensible for Colombia (or Andean LATAM) without local field validation?

**Status:** Open  
**Context:** No CO field primary; transferable options may include spectroscopy hub + litter CV triage (W4291123479, W4200249418) or Brazil beach RS abstract (W4408220111) — needs explicit transfer-risk framing in Phase 9.  
**Refs:** GAP-GEO-01; Phase 9 spine.

---

### Q10 — Is a tiered stack (edge vision + portable spectroscopy) supported by evidence or only by narrative?

**Status:** Open  
**Context:** Edge flags sparse in verified FT (W4391319604); capex spectroscopy strong (W4366815281, W4200249418). Tiered model is a **synthesis proposal**, not a corpus finding.  
**Refs:** GAP-EDGE-01, GAP-EDGE-02; gap-list §Suggested directions #2.

---

### Q11 — Which single metric family should headline the SLR results tables?

**Status:** Open (Phase 8 dependency)  
**Context:** metrics-legend recommends **three tables** (spectral / vision / counting) — confirm in `slr.md` draft and avoid merged rankings.  
**Refs:** metrics-legend.md §5.

---

### Q12 — Are abstract-only LATAM forwards (W4408220111, W4408550134, W3196128465) cited as primary evidence in `slr.md`?

**Status:** Open (policy)  
**Context:** Protocol allows forward papers; factory rule says no fabricated metrics. Default: **gap-map / future work only** unless FT obtained.  
**Refs:** W4408220111, W4408550134, W3196128465; extraction abstract-only rows.

---

### Q13 — Does the corpus support recommending phone-microscope YOLO (W4409162823) for aquatic monitoring?

**Status:** Open  
**Context:** Consumer-product matrix, $10 attachment cited in abstract; not aquatic field.  
**Refs:** W4409162823; GAP-EDGE-02.

---

### Q14 — Should future harvests split `global_south` into `field_site_gs` vs `author_affiliation_gs`?

**Status:** Open (methods)  
**Context:** Current single flag conflates geography and authorship (Q5). Protocol change would improve LATAM deployment claims.  
**Refs:** protocol.md; GAP-GEO-02.
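
One possible shape for the split is sketched below. It assumes extraction rows are modelled as small records; the class `ExtractionRow`, the helper `field_evidence`, and the use of `None` for "not recorded" are illustrative choices rather than the current schema. The two example rows reflect only what this report states: an India river field site for W4291123479, and a Colombian affiliation with no field site in the available text for W4392657594.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionRow:
    paper_id: str
    field_site_gs: Optional[bool]          # sampling site located in the Global South
    author_affiliation_gs: Optional[bool]  # any author affiliation in the Global South

    @property
    def global_south(self) -> bool:
        """Legacy single flag: true if either component is true."""
        return bool(self.field_site_gs) or bool(self.author_affiliation_gs)


def field_evidence(rows: list[ExtractionRow]) -> list[str]:
    """Paper IDs that can support deployment claims (field geography only)."""
    return [row.paper_id for row in rows if row.field_site_gs]


rows = [
    # India urban river field site; affiliation left unrecorded in this example.
    ExtractionRow("W4291123479", field_site_gs=True, author_affiliation_gs=None),
    # Colombian affiliation; no field site in the available text.
    ExtractionRow("W4392657594", field_site_gs=False, author_affiliation_gs=True),
]

print("legacy flag:", [row.paper_id for row in rows if row.global_south])
print("field evidence:", field_evidence(rows))
```

Both rows pass the legacy flag, but only the first survives the field-site filter, which is the distinction Q5 had to reconstruct by hand.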

---

## Index to gap-list IDs

| Question | Gap ID (if any) |
| --- | --- |
| Q2, Q9 | GAP-GEO-01 |
| Q5, Q14 | GAP-GEO-02 |
| Q7 | GAP-CORPUS-03 |
| Q8 | GAP-CORPUS-01 |
| Q3 | GAP-SCALE-01 |
| Q4 | (contradictions §1) |
| Q10 | GAP-EDGE-01, GAP-EDGE-02 |
| Q13 | GAP-EDGE-02 |


# Part VII — PRISMA flow


# PRISMA flow — microplastics ML detection SLR

**Status:** PRISMA counts locked for manuscript (Phase 8.16). Counts match `manifest.json` → `stats` (search lock **2026-05-18**). See [harvest-status-4.9-log.md](./harvest-status-4.9-log.md).

**Search lock date:** 2026-05-18  
**Databases:** OpenAlex (see [protocol.md](../../protocol.md))

## Flow counts

| Stage | Count | `manifest.stats` key |
| --- | ---: | --- |
| Records identified (all queries) | 228 | `identified` |
| Records after deduplication | 228 | `after_dedupe` |
| Records screened (title/abstract) | 228 | `screened_title_abstract` |
| Records excluded (title/abstract) | 112 | `excluded_title_abstract` |
| Records forwarded to full text | 116 | `forward_fulltext` |
| Full-text articles sought | 116 | `full_text_sought` |
| Full-text articles obtained (PDF or PMC HTML/txt) | 31 | `full_text_obtained` |
| OA PDFs in `corpus/raw/` | 25 | `oa_pdfs_in_raw` |
| Full-text HTML/normalized (non-PDF) | 6 | `full_text_html_obtained` |
| Full-text articles not retrieved | 85 | `full_text_not_retrieved` |
| Full-text pending | 0 | `full_text_pending_retrieval` |
| Studies summarised (all obtained FT) | 31 | `included` / `summaries_md` |
| Studies excluded at full text (access) | 85 | `excluded_full_text` |
| Structured extraction rows | 37 | `extraction_rows` |

## Diagram (text)

```text
IDENTIFICATION
  Records identified via OpenAlex (7 query strings; 2019–2025 lock)     n = 228
  Records after deduplication (DOI / OpenAlex ID)                      n = 228

SCREENING
  Records screened (title/abstract)                                    n = 228
    Records excluded (title/abstract)                                  n = 112
    Records forwarded to full text                                     n = 116

ELIGIBILITY (full text)
  Full-text articles sought                                            n = 116
    Full-text articles obtained                                        n =  31
      (25 OA PDF in corpus/raw/; 6 HTML/XML or normalized only)
    Full-text articles not retrieved                                   n =  85
      Paywalled                                                        n =  30
      OA download failed                                               n =  15
      OA not retrieved after sought                                    n =  40

INCLUDED (systematic map synthesis)
  Narrative summaries (corpus/summaries/)                              n =  31
    Primary-style articles (performance synthesis)                     n =  22
    Reviews (gap analysis only)                                        n =   9
  Structured extraction rows (extraction.csv; incl. 6 forward-only)    n =  37
  Forwarded without obtained FT (bibliography + unverified metrics)    n =  85
```

Rendered in manuscript: [slr.md §2.2.4](../../outputs/reports/slr.md).
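
The flow counts can be cross-checked mechanically. The sketch below hard-codes the values from the flow table and diagram above under their `manifest.stats` keys and tests the identities the diagram implies. The function name `check_flow` and the `included_split` dictionary are illustrative; reading the values from `manifest.json` is left out because its exact layout is not shown in this report.

```python
# Counts copied from the flow table; keys follow its `manifest.stats` column.
stats = {
    "identified": 228,
    "after_dedupe": 228,
    "screened_title_abstract": 228,
    "excluded_title_abstract": 112,
    "forward_fulltext": 116,
    "full_text_sought": 116,
    "full_text_obtained": 31,
    "oa_pdfs_in_raw": 25,
    "full_text_html_obtained": 6,
    "full_text_not_retrieved": 85,
    "full_text_pending_retrieval": 0,
    "included": 31,
}
# Reasons for the 85 full texts that were not retrieved (from the diagram).
not_retrieved_reasons = {"paywalled": 30, "oa_download_failed": 15, "oa_not_retrieved": 40}
# Split of the 31 summarised studies (from the diagram).
included_split = {"primary": 22, "reviews": 9}


def check_flow(s, reasons, split):
    """Return a list of inconsistencies; an empty list means the counts add up."""
    checks = [
        ("screened = excluded + forwarded",
         s["screened_title_abstract"],
         s["excluded_title_abstract"] + s["forward_fulltext"]),
        ("sought = forwarded", s["full_text_sought"], s["forward_fulltext"]),
        ("sought = obtained + not retrieved + pending",
         s["full_text_sought"],
         s["full_text_obtained"] + s["full_text_not_retrieved"]
         + s["full_text_pending_retrieval"]),
        ("obtained = PDF + HTML",
         s["full_text_obtained"],
         s["oa_pdfs_in_raw"] + s["full_text_html_obtained"]),
        ("not retrieved = sum of reasons",
         s["full_text_not_retrieved"], sum(reasons.values())),
        ("included = primary + reviews", s["included"], sum(split.values())),
    ]
    return [f"{label}: {left} != {right}"
            for label, left, right in checks if left != right]


problems = check_flow(stats, not_retrieved_reasons, included_split)
print(problems if problems else "flow counts are internally consistent")
```

A check like this is cheap to rerun whenever the harvest or screening log changes, so the diagram and the table cannot drift apart silently.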

## Not retrieved breakdown (n=85)

From `papers.jsonl` where `harvest_status: fulltext_unavailable`:

| Reason | n |
| --- | ---: |
| `oa_not_retrieved` | 40 |
| `paywalled` | 30 |
| `oa_download_failed` | 15 |
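
The breakdown above is a tally over `papers.jsonl`. A minimal sketch of that tally follows. `harvest_status` is the field named above, while `unavailable_reason`, the `EXAMPLE-*` identifiers, and the `fulltext_obtained` status value are hypothetical stand-ins, since the real record layout is not reproduced in this report.

```python
import json
from collections import Counter

# Made-up records standing in for lines of papers.jsonl.
# `unavailable_reason` is an assumed field name used only for this sketch.
sample_lines = [
    '{"paper_id": "EXAMPLE-1", "harvest_status": "fulltext_unavailable", "unavailable_reason": "paywalled"}',
    '{"paper_id": "EXAMPLE-2", "harvest_status": "fulltext_unavailable", "unavailable_reason": "paywalled"}',
    '{"paper_id": "EXAMPLE-3", "harvest_status": "fulltext_unavailable", "unavailable_reason": "oa_download_failed"}',
    '{"paper_id": "EXAMPLE-4", "harvest_status": "fulltext_obtained"}',
]


def tally_unavailable(lines):
    """Count reasons among records whose full text was not retrieved."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        record = json.loads(line)
        if record.get("harvest_status") == "fulltext_unavailable":
            counts[record.get("unavailable_reason", "unknown")] += 1
    return counts


for reason, n in tally_unavailable(sample_lines).most_common():
    print(f"| `{reason}` | {n} |")
```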

## Title/abstract exclusions (n=112)

See [screening-rationale.md](../../outputs/knowledge/screening-rationale.md) for the exclusion buckets and illustrative examples per bucket.

## Notes

- This project is a **systematic map**: all **31** obtained full texts were summarised, and none was dropped from synthesis in a second wave on quality-scoring grounds.
- **85** forwards without retrievable open full text remain in `papers.jsonl` and the References bibliography (Section 2.2.2) but do not supply verified performance tables.
- Phase 4.9 closed out the pending full-text (FT) retrieval queue (`full_text_pending_retrieval: 0`).
- PDF logs: [fulltext-batch-a-log.md](./fulltext-batch-a-log.md), [fulltext-batch-b-log.md](./fulltext-batch-b-log.md), [fulltext-batch-c-log.md](./fulltext-batch-c-log.md).
- TA exclusion buckets: [screening-rationale.md](../../outputs/knowledge/screening-rationale.md).


# Part VIII — Protocol


# Protocol

## Title (working)

Systematic map of machine learning and computer vision for microplastic detection in aquatic matrices (2019–2025): performance, modalities, and deployment gaps in resource-limited settings.

## Date range

Publications **2019-01-01** through search lock date (set in `manifest.json` when harvest completes).

## Databases (pilot)

- [OpenAlex](https://openalex.org/) (primary for metadata harvest)

## Search strings (OpenAlex)

Run separately; dedupe by DOI in `manifest.json`:

1. `microplastic detection machine learning CNN`
2. `microplastic computer vision deep learning water`
3. `microplastic Raman spectroscopy machine learning`
4. `microplastic hyperspectral imaging classification`
5. `microplastic YOLO detection`

Filters: `from_publication_date:2019-01-01`, `type:article|review`.
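
As a hedged sketch of how these strings and filters could be turned into requests, the snippet below only builds URLs and deduplicates already-fetched records; it makes no network call. It assumes the public OpenAlex `works` endpoint with its `search`, `filter`, `per-page`, and `cursor` parameters, and that each returned record carries `doi` and `id` fields. The helper names `build_url`, `doi_key`, and `dedupe_by_doi` are illustrative and not taken from the project's harvest code.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"
SEARCH_STRINGS = [
    "microplastic detection machine learning CNN",
    "microplastic computer vision deep learning water",
    "microplastic Raman spectroscopy machine learning",
    "microplastic hyperspectral imaging classification",
    "microplastic YOLO detection",
]
# Comma joins filters with AND; the pipe means OR within one filter.
FILTERS = "from_publication_date:2019-01-01,type:article|review"


def build_url(query, per_page=200):
    """Build one request URL; cursor=* asks for the first page of cursor paging."""
    params = {"search": query, "filter": FILTERS, "per-page": per_page, "cursor": "*"}
    # Keep the filter punctuation readable instead of percent-encoding it.
    return f"{BASE_URL}?{urlencode(params, safe=':,|*')}"


def doi_key(record):
    """Normalise a DOI for comparison; fall back to the OpenAlex ID."""
    doi = (record.get("doi") or "").strip().lower()
    prefix = "https://doi.org/"
    if doi.startswith(prefix):
        doi = doi[len(prefix):]
    return doi or record.get("id")


def dedupe_by_doi(records):
    """Keep the first record seen for each DOI (or ID when the DOI is missing)."""
    seen, unique = set(), []
    for record in records:
        key = doi_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


for query in SEARCH_STRINGS:
    print(build_url(query))
```

Running each string separately and merging afterwards keeps the per-query hit counts available for the PRISMA identification stage.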

## Inclusion

- Microplastic(s) or nanoplastic(s) in **aquatic** context (water, wastewater, marine/freshwater, sediment with explicit detection pipeline)
- **ML/DL/CV** or automated classification/counting
- Reports **performance** or identifiable limitation of the method

## Exclusion

- Food-web / toxicity only (no detection method)
- Polymer chemistry only (no identification pipeline)
- Policy-only or citizen science without a model
- Lab synthesis papers with no detection performance metrics

## Tag separately

- Existing **systematic reviews** → gap map only; do not count as primary studies in extraction table.

## Extraction fields

See `corpus/structured/extraction-template.csv`.

## Global South / LATAM lens

Flag papers with: author affiliation, study site, or stated low-cost/edge deployment in resource-limited settings. Expect sparse coverage; document as a gap.


# Part IX — Metrics legend


# Metrics legend — how to compare reported performance

**Phase:** 7.7 | **Date:** 2026-05-18  
**Use with:** [contradictions.md](./contradictions.md) · [glossary.md](./glossary.md) · `corpus/structured/extraction.csv`

This corpus mixes **computer-vision detection/segmentation**, **spectral polymer classification**, and **counting/metrology** metrics. **Do not rank papers on unlike numbers.** Group by task family first.

---

## 1. Task families in this SLR

| Family | Question answered | Typical metrics | Example `paper_id` |
| --- | --- | --- | --- |
| **A. Spectral polymer ID** | Is this particle polymer X? | Accuracy, precision, recall, sensitivity, specificity, κ, AUC | W4200249418, W4366815281, W4382931577 |
| **B. Image segmentation** | Which pixels are particle/mask? | mIoU, mF1, Dice, per-class IoU | W4282979647, W4205835860 |
| **C. Object detection** | Where are objects (boxes)? | mAP, AP@0.5, precision, recall, F1 | W4291123479, W4391755619 |
| **D. Detection gains** | How much better than baseline? | ΔmAP, ΔAP per scale (not absolute) | W4321194910, W3204790372 |
| **E. Pixel/scene classification** | Is this pixel/scene plastic/debris? | Accuracy (often scenario-specific) | W4380082849, W3155690422 |
| **F. Counting / quantification** | How many particles / concentration? | Count error %, particles/m³, Bland-Altman bias | W3003736709, W4213300830, W4391319604 |
| **G. Non-ML / method comparison** | Which instrument agrees? | Agreement, cited literature rates | W4308496878 |

Families **A** and **B–E** are **not interchangeable**. Family **F** needs explicit unit and reference method.

---

## 2. Metric definitions (reporting lens)

### Classification (spectral or image-level)

| Metric | Meaning | Compare when |
| --- | --- | --- |
| **Accuracy** | (TP+TN) / all predictions | Same #classes, same split, same matrix; watch **class imbalance** |
| **Precision** | TP / (TP+FP) — positive predictions that are correct | Detection **after** boxes proposed; or per polymer class |
| **Recall / sensitivity** | TP / (TP+FN) — fraction of true positives found | Pair with precision; polymer-specific recalls differ (W4200249418) |
| **Specificity** | TN / (TN+FP) | Common in medical-style spectral papers (W4382931577) |
| **F1** | 2PR/(P+R) | Single threshold; **mF1** = mean over classes (W4282979647) |
| **Cohen κ (kappa)** | Agreement beyond chance | μFTIR multi-class RDF (W4200249418) |
| **AUC / ROC-AUC** | Rank quality across thresholds | Microfluidic CNN vs SVM (W4383534319); not same as accuracy at one threshold |
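
The class-imbalance warning is easiest to see with numbers. The sketch below computes the threshold-based metrics in the table from one binary confusion matrix; the counts are invented for illustration and do not come from any paper in the corpus.

```python
def binary_metrics(tp, fp, fn, tn):
    """Metrics from a 2x2 confusion matrix (positive class = target polymer)."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance is estimated from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


# Invented imbalanced case: 10 true positives among 100 particles,
# and the classifier finds only half of them without any false alarm.
for name, value in binary_metrics(tp=5, fp=0, fn=5, tn=90).items():
    print(f"{name:12s} {value:.3f}")
```

Here accuracy is 0.95 while recall is 0.5: half of the target particles are missed, yet the headline number looks strong because negatives dominate. Kappa and F1 both land well below the accuracy, which is why they are worth reporting alongside it.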

### Object detection (bounding boxes)

| Metric | Meaning | Compare when |
| --- | --- | --- |
| **mAP** | Mean AP averaged over classes and IoU thresholds (COCO-style) | Same dataset, same IoU policy; **low mAP + high precision** possible (W4391755619) |
| **AP@0.5** | AP at single IoU 0.5 | Easier than COCO mAP; check if paper uses COCO or PASCAL protocol |
| **Precision (detector)** | Among predicted boxes, fraction correct | **Not** comparable to spectral precision without defining “correct box” |
| **Recall (detector)** | Fraction of ground-truth objects found | W4291123479 reports 86% recall with mAP 89% |

**Rule:** Reporting only **precision 85%** from W4391755619 **without mAP ~34%** misstates field MP detection performance.
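
One way a detector can show high precision while still missing most objects is sketched below with box IoU and a greedy match at IoU 0.5. The boxes are synthetic, the matching is a simplification of the COCO and PASCAL protocols (no confidence ranking), and nothing here reproduces the evaluation in W4391755619.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted boxes to ground-truth boxes."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: box_iou(pred, gt), default=None)
        if best is not None and box_iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched once
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(unmatched)
    precision = (true_positives / (true_positives + false_positives)
                 if predictions else 0.0)
    recall = (true_positives / (true_positives + false_negatives)
              if ground_truth else 0.0)
    return precision, recall


# Synthetic scene: ten small particles in a row, only three of them detected.
ground_truth = [(10 * i, 0, 10 * i + 4, 4) for i in range(10)]
predictions = ground_truth[:3]
precision, recall = precision_recall(predictions, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Every predicted box is correct, so precision is perfect, but seven of ten particles are never proposed. AP integrates precision over recall, so a detector in this regime scores low on mAP even though its precision reads well in isolation.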

### Segmentation (masks)

| Metric | Meaning | Compare when |
| --- | --- | --- |
| **IoU / Jaccard** | Intersection over union per mask | Same resolution and label definition |
| **mIoU** | Mean IoU over classes | MARIDA mean 0.57 hides 0.02–1.0 spread (W4205835860) |
| **mRecall / mF1** | Means over classes | MP-Net: best F1 architecture ≠ best recall architecture (W4282979647) |
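
The point about a mean hiding the per-class spread can be sketched as follows. The masks are toy pixel sets, and the class names and IoU values are invented for illustration; they are not the MARIDA per-class results.

```python
def mask_iou(pred_pixels, true_pixels):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = pred_pixels | true_pixels
    # Two empty masks are scored 0.0 here; conventions differ between papers.
    return len(pred_pixels & true_pixels) / len(union) if union else 0.0


def summarize(per_class_iou):
    """Return (mIoU, lowest class IoU, highest class IoU)."""
    values = list(per_class_iou.values())
    return sum(values) / len(values), min(values), max(values)


# Toy masks: two 2x2 squares that share one column of two pixels.
pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(f"mask IoU = {mask_iou(pred, truth):.3f}")

# Invented per-class scores: one class is nearly undetected.
per_class = {"class_a": 0.95, "class_b": 0.70, "class_c": 0.05}
mean_iou, worst, best = summarize(per_class)
print(f"mIoU = {mean_iou:.3f}, range = {worst:.2f} to {best:.2f}")
```

Reporting the range next to the mean makes it visible when a class of interest sits at the bottom of the spread.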

### Counting & agreement

| Metric | Meaning | Compare when |
| --- | --- | --- |
| **Count error %** | Deviation from manual/reference count | SMACC 1.4% on lab sediment (W3003736709) |
| **Classification error %** | Misclassified particle types after segmentation | SMACC <4% except line-shaped class |
| **Bland-Altman bias / SD** | Mean count difference vs reference (bias) and the SD of those differences | Polarization holographic flow (W4391319604) |
| **Concentration** | particles/m³ etc. | Po River 1.89–8.22 (W4213300830); not a classifier metric |
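
A minimal sketch of the two agreement metrics, using the standard library only. The paired counts are invented, and the 1.96 multiplier is the usual choice for 95% limits of agreement rather than something stated in the corpus papers.

```python
from statistics import mean, stdev


def count_error_percent(automated, reference):
    """Absolute deviation of an automated count from the reference count, in %."""
    return 100.0 * abs(automated - reference) / reference


def bland_altman(automated_counts, reference_counts):
    """Return (bias, SD of differences, (lower, upper) limits of agreement)."""
    diffs = [a - r for a, r in zip(automated_counts, reference_counts)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Invented paired counts for four samples (automated vs manual reference).
automated = [103, 99, 106, 96]
reference = [100, 100, 100, 100]

print(f"count error, sample 1: {count_error_percent(automated[0], reference[0]):.1f}%")
bias, sd, (lower, upper) = bland_altman(automated, reference)
print(f"bias={bias:.2f} SD={sd:.2f} limits=({lower:.2f}, {upper:.2f})")
```

A small bias with wide limits of agreement means the method is right on average but unreliable per sample, which a single count-error percentage would not reveal.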

### Relative / incremental reports

| Metric | Meaning | Compare when |
| --- | --- | --- |
| **+X% mAP vs baseline** | Improvement only | W4321194910, W3204790372 — need baseline absolute values to interpret |
| **Accuracy scenario 1 vs 2** | Different label definitions | W4380082849: 98% vs 83% — **always name scenario** |
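
Why the baseline matters: a reported "+9.6% mAP" can be read as percentage points or as a relative gain, and the two readings diverge by an amount that depends on the baseline. The baselines below are hypothetical and are not the values from W3204790372.

```python
def map_after_gain(baseline_map, gain, relative=False):
    """Absolute mAP (in %) after a reported gain.

    relative=False treats the gain as percentage points;
    relative=True treats it as a percentage of the baseline.
    """
    if relative:
        return baseline_map * (1 + gain / 100.0)
    return baseline_map + gain


# Hypothetical baselines, chosen only to show the spread between readings.
for baseline in (30.0, 60.0):
    as_points = map_after_gain(baseline, 9.6)
    as_relative = map_after_gain(baseline, 9.6, relative=True)
    print(f"baseline {baseline:.1f}% -> points: {as_points:.2f}%, "
          f"relative: {as_relative:.2f}%")
```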

---

## 3. What you may compare (green / amber / red)

| Comparison | Verdict | Guidance |
| --- | --- | --- |
| mAP vs mAP, same task (litter OD) | 🟢 | W4291123479 vs W4391755619 only if both are detection; note MP vs macro |
| mIoU vs mIoU, segmentation | 🟢 | W4282979647 vs W4205835860 — different domains (microscopy vs satellite) → describe separately |
| Accuracy vs accuracy, spectral ID | 🟢 | W4362015000 vs W4382931577 — note lab matrix, #polymer classes, sample n |
| Precision (detector) vs mAP | 🟡 | Same paper OK side-by-side; **never** pick the higher as “headline” |
| Accuracy (98%) vs mAP (34%) | 🔴 | Different constructs (e.g. W4380082849 accuracy vs W4391755619 mAP) |
| Spectral accuracy vs YOLO mAP | 🔴 | Different families (A vs C) |
| Satellite accuracy vs microscopy F1 | 🔴 | Different scale (macro vs MP) |
| Review-cited % vs primary table | 🟡 | Reviews (W4318615471) secondary; trace to primary if possible |
| `unverified` extraction value | 🟡 | Do not treat as verified until FT check (Phase 6.12) |
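
The first rows of the table can be approximated by a small rule, sketched below as an assumption-laden simplification: compare only like metrics within one task family, allow unlike metrics side by side only within the same paper, and carry the macro vs MP scale as a note. The `Reported` class and `verdict` function are illustrative names, and the amber rows for review-cited or `unverified` values are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reported:
    paper_id: str
    family: str  # task family letter, A to G
    metric: str  # e.g. "mAP", "accuracy", "precision"
    scale: str   # "macro" or "MP"


def verdict(a, b):
    """Return (colour, guidance) for comparing two reported values."""
    if a.metric != b.metric:
        if a.paper_id == b.paper_id:
            return "amber", "same paper: show side by side, no headline pick"
        return "red", "different metrics measure different constructs"
    if a.family != b.family:
        return "red", "same metric name, different task family"
    if a.scale != b.scale:
        return "green", "comparable, but note macro vs MP scale"
    return "green", "comparable within family"


river_yolo = Reported("W4291123479", "C", "mAP", "macro")
uv_map = Reported("W4391755619", "C", "mAP", "MP")
uv_precision = Reported("W4391755619", "C", "precision", "MP")
satellite = Reported("W4380082849", "E", "accuracy", "macro")

print(verdict(river_yolo, uv_map))
print(verdict(uv_precision, uv_map))
print(verdict(satellite, uv_map))
```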

---

## 4. Corpus metric map (primary rows, obtained FT emphasised)

| `paper_id` | Family | Reported metrics (from extraction) | MP-relevant? |
| --- | --- | --- | --- |
| W3003736709 | F | count_error 1.4%; class_error <4% | Yes (lab MPs) |
| W4282979647 | B | mF1 0.736; mIoU 0.617; mRecall 0.883 | Yes (stained MPs) |
| W4391319604 | F | accuracy up to 96%; Bland-Altman SD | Yes (flow MPs) |
| W4383534319 | A | accuracy >93%; AUC 0.98 | Yes (small MPs; n=5 field) |
| W4362015000 | A | accuracy 93.81% | Yes (weathered polymers) |
| W4366815281 | A | precision ≥97.1%; recall ≥99.4% | Yes (5 polymers) |
| W4382931577 | A | accuracy 98.8%; sens/spec | Yes (nanoplastics lab) |
| W3172017684 | A | CNN ~97% | Unclear MP subset |
| W4200249418 | A | accuracy 0.9766; κ 0.9690; per-polymer sens | Yes |
| W4213300830 | F | particles/m³; per-level metrics unverified | Yes (freshwater) |
| W4380082849 | E | accuracy 98% / 83% / 75% by scenario | Macro debris |
| W4205835860 | B | avg IoU 0.57 | Macro debris |
| W3155690422 | E | ~86% accuracy; 8% pixel coverage target | Macro litter |
| W4291123479 | C | mAP 89%; F1 0.8; recall 86% | Macro litter |
| W4391755619 | C | precision 85–88%; mAP 34–36% | MPs (field UV) |
| W4321194910 | D | ΔAP +1.2–2.6% | Macro debris |
| W3204790372 | D | +9.6% mAP; +5% seg (unverified) | Macro litter |
| W3091414454 | E | accuracy up to 83%; PLQ 60–71% | Macro litter |

Abstract-only / unverified rows: see `extraction.csv` — report with `unverified` tag in `slr.md`.

---

## 5. Recommended tables for `slr.md` (Phase 8)

Split into **at least three** tables—do not merge:

1. **Spectral ML ID** — columns: `paper_id`, matrix, #classes, metric(s), n, lab/field  
2. **Vision detection/segmentation** — columns: `paper_id`, task, mAP/mIoU/F1, scale (macro/MP), field/lab  
3. **Throughput / counting** — columns: `paper_id`, unit, reference method, bias/limit

Footnote every cell with **scenario** (e.g. W4380082849 scenario 2) or **class subset** (five polymers only).

---

## 6. Reporting checklist (per paper)

- [ ] Task family (A–G) stated  
- [ ] Metric definitions match paper (COCO mAP vs custom accuracy)  
- [ ] Train/test split and n cited or marked `unverified`  
- [ ] Macro litter vs MP size class explicit  
- [ ] Lab vs field explicit  
- [ ] If multiple metrics, report **all** salient ones (precision **and** mAP when both exist)  
- [ ] Reviews labelled secondary — no performance ranking from review aggregates alone  

---

## 7. Quick reference: misleading headlines to avoid

| Do not write | Write instead |
| --- | --- |
| “Detection accuracy up to 98%” (satellite) | “Pixel-level debris classification 98% in scenario 1; 83% for plastic pixels; 10 m resolution” (W4380082849) |
| “87% precision MP detection” | “87% precision on detected UV boxes; mAP 34% (small objects)” (W4391755619) |
| “State-of-the-art YOLO for microplastics” | “YOLO mAP 89% on **riverine macro solid waste**, India” (W4291123479) |
| “Raman ML near 100% in environment” | “≥97% precision/recall on **five polymers**, 10.7% samples with polymer hits” (W4366815281) |

---

## Source files

- Metric strings: `corpus/structured/extraction.csv` → `metrics` column  
- Tensions: [contradictions.md](./contradictions.md) §1, §7  
- Verified subset: `corpus/structured/extraction-metrics-verify-6.11-log.md`


# Part X — Modality map


# Modality map — microplastics ML detection SLR

**Status:** Phase 7.5 populated.  
Maps each **detection modality** to `paper_id` links. Primary rows come from `corpus/structured/extraction.csv` (37 papers); supplementary tables cover included full text outside the extraction table and key forwards.

**Legend:** 📄 = [summary](../../corpus/summaries/{id}.md) available · 🔗 = `papers.jsonl` forward only · ⭐ = included full text, not in `extraction.csv`

---

## High-level clusters (primary `is_review: false` only)

| Cluster | `paper_id` (linked) | n |
| --- | --- | ---: |
| **Microscopy & lab imaging** | [W3003736709](../../corpus/summaries/W3003736709.md) · [W4282979647](../../corpus/summaries/W4282979647.md) · [W4391319604](../../corpus/summaries/W4391319604.md) · [W4383534319](../../corpus/summaries/W4383534319.md) · W4409162823 | 4+1 |
| **Raman / SERS + ML** | [W4362015000](../../corpus/summaries/W4362015000.md) · [W4366815281](../../corpus/summaries/W4366815281.md) · [W4382931577](../../corpus/summaries/W4382931577.md) · [W3172017684](../../corpus/summaries/W3172017684.md) · [W4304690559](../../corpus/summaries/W4304690559.md) · W4392657594 · W4404688861 · W3196128465 | 5+3 |
| **FTIR + ML** | [W4200249418](../../corpus/summaries/W4200249418.md) · W4408550134 | 1+1 |
| **Hyperspectral / multispectral** | [W4213300830](../../corpus/summaries/W4213300830.md) · [W3155690422](../../corpus/summaries/W3155690422.md) · [W4380082849](../../corpus/summaries/W4380082849.md) · [W4205835860](../../corpus/summaries/W4205835860.md) · W4408220111 | 4+1 |
| **RGB / YOLO / object detection** | [W4291123479](../../corpus/summaries/W4291123479.md) · [W3091414454](../../corpus/summaries/W3091414454.md) · [W4391755619](../../corpus/summaries/W4391755619.md) · [W4321194910](../../corpus/summaries/W4321194910.md) · [W3204790372](../../corpus/summaries/W3204790372.md) · W4400418758 · W4385454320 | 5+2 |
| **Hybrid / metrology / other** | [W4385411640](../../corpus/summaries/W4385411640.md) · ⭐ [W4308496878](../../corpus/summaries/W4308496878.md) | 2 |

*Where n is written a+b, b counts abstract-only forwards without 📄.*

---

## Extraction modalities (`extraction.csv` → `modality`)

Source: `corpus/structured/extraction-by-modality.csv` (34 modality keys, 37 rows).

### Spectroscopy

| `modality` | Years | Papers | Summary | GS | Edge |
| --- | --- | --- | --- | --- | --- |
| `raman_spectroscopy` | 2023–2024 | [W4362015000](../../corpus/summaries/W4362015000.md) · [W4382931577](../../corpus/summaries/W4382931577.md) · W4392657594 | 2📄 | 1 | — |
| `raman_microspectroscopy` | 2023 | [W4366815281](../../corpus/summaries/W4366815281.md) | 📄 | — | — |
| `raman_spectroscopy_in_situ` | 2021 | [W3172017684](../../corpus/summaries/W3172017684.md) | 📄 | — | — |
| `raman_SERS_bench` | 2022 | [W4304690559](../../corpus/summaries/W4304690559.md) | 📄 | — | — |
| `raman_in_situ_prototype` | 2024 | W4404688861 | 🔗 | — | — |
| `ftir_microspectroscopy_imaging` | 2021 | [W4200249418](../../corpus/summaries/W4200249418.md) | 📄 | — | — |
| `ftir_spectroscopy` | 2025 | W4408550134 | 🔗 | yes | — |
| `vibrational_spectroscopy` | 2021 | W3196128465 | 🔗 | yes | — |

### Imaging — microscopy & microfluidics

| `modality` | Years | Papers | Summary | GS | Edge |
| --- | --- | --- | --- | --- | --- |
| `microscopy_lab_imaging` | 2020 | [W3003736709](../../corpus/summaries/W3003736709.md) | 📄 | — | — |
| `fluorescence_microscopy` | 2022 | [W4282979647](../../corpus/summaries/W4282979647.md) | 📄 | — | — |
| `polarization_holographic_microscopy` | 2024 | [W4391319604](../../corpus/summaries/W4391319604.md) | 📄 | — | yes |
| `microfluidic_optical_imaging` | 2023 | [W4383534319](../../corpus/summaries/W4383534319.md) | 📄 | — | — |
| `phone_microscopy_rgb` | 2025 | W4409162823 | 🔗 | — | yes |
| `methods_primer_image_metrology` | 2023 | [W4385411640](../../corpus/summaries/W4385411640.md) | 📄 | — | — |

### Hyperspectral & remote sensing

| `modality` | Years | Papers | Summary | GS | Edge |
| --- | --- | --- | --- | --- | --- |
| `hyperspectral_imaging_SWIR` | 2022 | [W4213300830](../../corpus/summaries/W4213300830.md) | 📄 | — | — |
| `hyperspectral_PRISMA_pansharpened_S2` | 2021 | [W3155690422](../../corpus/summaries/W3155690422.md) | 📄 | — | — |
| `multispectral_satellite_S2` | 2022–2023 | [W4380082849](../../corpus/summaries/W4380082849.md) · [W4205835860](../../corpus/summaries/W4205835860.md) | 2📄 | — | — |
| `remote_sensing_GNSS_Raman_ML` | 2025 | W4408220111 | 🔗 | yes | — |

### RGB computer vision (detection / segmentation)

| `modality` | Years | Papers | Summary | GS | Edge |
| --- | --- | --- | --- | --- | --- |
| `rgb_object_detection` | 2022 | [W4291123479](../../corpus/summaries/W4291123479.md) | 📄 | yes | — |
| `rgb_drone_aerial` | 2020 | [W3091414454](../../corpus/summaries/W3091414454.md) | 📄 | yes | — |
| `rgb_object_detection_uv` | 2024 | [W4391755619](../../corpus/summaries/W4391755619.md) | 📄 | yes | yes |
| `rgb_object_detection_auv` | 2023 | [W4321194910](../../corpus/summaries/W4321194910.md) | 📄 | — | — |
| `rgb_object_detection_underwater` | 2021 | [W3204790372](../../corpus/summaries/W3204790372.md) | 📄 | yes | yes |
| `rgb_instance_segmentation_underwater` | 2023 | W4385454320 | 🔗 | — | — |
| `rgb_ai_camera_edge` | 2024 | W4400418758 | 🔗 | — | yes |

### Reviews in `extraction.csv` (gap-map only — not primary performance)

| `modality` | Papers |
| --- | --- |
| `review_synthesis_spectroscopy` | [W4296114416](../../corpus/summaries/W4296114416.md) |
| `review_synthesis_ID_methods` | [W3134265767](../../corpus/summaries/W3134265767.md) |
| `review_synthesis_marine` | [W3122508379](../../corpus/summaries/W3122508379.md) |
| `review_hsi_plastic_waste` | [W4318615471](../../corpus/summaries/W4318615471.md) |
| `review_bibliometric_imaging` | [W4313826580](../../corpus/summaries/W4313826580.md) |
| `review_occurrence_mitigation` | [W4210266455](../../corpus/summaries/W4210266455.md) |
| `review_aged_MP_analytics` | [W4393943493](../../corpus/summaries/W4393943493.md) |
| `review_corona_risk` | [W4290026722](../../corpus/summaries/W4290026722.md) |
| `review_textile_washing` | [W4382940669](../../corpus/summaries/W4382940669.md) |

---

## Included full text outside `extraction.csv`

| `paper_id` | Modality (assigned) | Summary | Role |
| --- | --- | --- | --- |
| ⭐ [W4308496878](../../corpus/summaries/W4308496878.md) | O-PTIR + simultaneous Raman | 📄 | Spectroscopy method comparison; visual-ID failure cite |
| [W2982912960](../../corpus/summaries/W2982912960.md) | *(peripheral)* | 📄 | Structural metals review — not MP CV |

---

## Forward papers (not extracted; HSI / in-situ gap map)

| `paper_id` | Modality (title/abstract) | FT |
| --- | --- | --- |
| W2936115560 | Fish-intestine HSI + SVM | paywalled |
| W4385737119 | HSI shape taxonomy (11,042 particles) | forward |
| W4414305742 | HSI shape DL (nine CNN architectures) | forward |
| W4409887007 | ML + spectroscopy systematic review | forward |
| W4396828529 | In-situ aquatic MP systematic review | forward |

---

## Matrix overlay

Rows list **primary** extraction papers (`review_count = 0` in modality CSV) by aquatic-relevant matrix.

| Matrix | Modalities present | `paper_id` examples |
| --- | --- | --- |
| **Freshwater / river** | HSI SWIR; RGB river OD | [W4213300830](../../corpus/summaries/W4213300830.md) · [W4291123479](../../corpus/summaries/W4291123479.md) |
| **Marine / coastal** | Satellite S2; underwater OD; UV Faster R-CNN; drone | [W4380082849](../../corpus/summaries/W4380082849.md) · [W4205835860](../../corpus/summaries/W4205835860.md) · [W3204790372](../../corpus/summaries/W3204790372.md) · [W4391755619](../../corpus/summaries/W4391755619.md) · [W3091414454](../../corpus/summaries/W3091414454.md) |
| **Seawater / surface water** | Microfluidic; SERS bench | [W4383534319](../../corpus/summaries/W4383534319.md) · [W4304690559](../../corpus/summaries/W4304690559.md) |
| **Wastewater / environmental catchment** | µ-Raman DL | [W4366815281](../../corpus/summaries/W4366815281.md) |
| **Multi-matrix environmental (lab)** | μFTIR imaging RDF | [W4200249418](../../corpus/summaries/W4200249418.md) |
| **Beach / sediment (lab)** | Microscopy SMACC | [W3003736709](../../corpus/summaries/W3003736709.md) |
| **Brazil field (LATAM)** | Remote sensing + ML | W4408220111 |
| **Non-aquatic / consumer** | Phone microscopy | W4409162823 |

---

## Counts (primary evidence)

| Bucket | Papers in `extraction.csv` | With 📄 summary |
| --- | ---: | ---: |
| Spectroscopy + ML (Raman/FTIR/vibrational) | 10 | 6 |
| Microscopy / microfluidics / fluorescence / holographic | 6 | 5 |
| Hyperspectral / satellite / remote | 5 | 4 |
| RGB object detection / segmentation | 7 | 5 |
| **Primary total** | **28** | **20** |
| Reviews in extraction | 9 | 9 |
| ⭐ Supplement (included FT) | 1 (+1 peripheral) | 2 |

---

## Notes

- **GS** = `global_south: yes` in extraction; **Edge** = `edge_low_cost: yes`.
- Macro litter vs MP: RGB and satellite clusters mostly target **macro** debris; spectroscopy/microscopy target **MP** size classes.
- Regenerate modality roll-up from CSV: `corpus/structured/extraction-by-modality.csv`.
- Cross-reference: [glossary.md](./glossary.md) · [claims.jsonl](./claims.jsonl) · [reviews-synthesis.md](./reviews-synthesis.md).
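
A hedged sketch of the modality roll-up follows. It assumes the extraction CSV exposes `paper_id`, `modality`, `is_review`, `global_south`, and `edge_low_cost` columns with `true`/`false` and `yes` values; the real header may differ. The three primary sample rows mirror single-paper rows from the tables above, the flags on the review row are left blank as placeholders, and an in-memory string stands in for the file.

```python
import csv
import io
from collections import defaultdict

# Assumed column layout; values for the primary rows follow the modality tables.
SAMPLE_CSV = """paper_id,modality,is_review,global_south,edge_low_cost
W4291123479,rgb_object_detection,false,yes,
W4391755619,rgb_object_detection_uv,false,yes,yes
W4200249418,ftir_microspectroscopy_imaging,false,,
W4296114416,review_synthesis_spectroscopy,true,,
"""


def roll_up(csv_text):
    """Group extraction rows by modality and count review, GS, and Edge flags."""
    groups = defaultdict(
        lambda: {"paper_ids": [], "review_count": 0, "gs": 0, "edge": 0}
    )
    for row in csv.DictReader(io.StringIO(csv_text)):
        group = groups[row["modality"]]
        group["paper_ids"].append(row["paper_id"])
        # Boolean comparisons add as 0 or 1.
        group["review_count"] += row["is_review"].strip().lower() == "true"
        group["gs"] += row["global_south"].strip().lower() == "yes"
        group["edge"] += row["edge_low_cost"].strip().lower() == "yes"
    return dict(groups)


rolled = roll_up(SAMPLE_CSV)
primary = {m: g for m, g in rolled.items() if g["review_count"] == 0}
print(f"{len(rolled)} modality keys, {len(primary)} primary")
for modality, group in sorted(primary.items()):
    print(modality, group["paper_ids"], "GS:", group["gs"], "Edge:", group["edge"])
```

Filtering on `review_count == 0` is the same rule the matrix overlay uses to select primary rows.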


# Part XI — Screening rationale (edge cases)


# Title/abstract screening rationale

**Phase:** 3.13  
**Search lock:** 2026-05-18  
**Corpus:** 228 records screened; 116 `forward_fulltext`; 112 `excluded`; 0 `maybe` (see [maybe-second-pass-log.md](../../corpus/structured/maybe-second-pass-log.md)).

This document records **edge-case rules** applied at title/abstract screening. Per-record decisions remain authoritative in `corpus/structured/screening-log.csv`. Inclusion criteria: [protocol.md](../../protocol.md).

---

## Decision flow (title/abstract)

```text
Candidate record
  ├─ Retracted / clearly unrelated topic? → excluded
  ├─ MPs + ML/CV/spectroscopy automation + aquatic matrix in title/abstract? → forward_fulltext
  ├─ Systematic/scoping review on MP detection, spectroscopy, or aquatic ML? → forward_fulltext (gap map; not primary count)
  ├─ MPs + ML but matrix is soil, air, food, blood, fashion, landfill, livestock, etc.? → excluded
  ├─ Aquatic MPs but no automated detection / performance pipeline? → excluded
  └─ Otherwise → excluded (generic off-topic or insufficient signal in abstract)
```

**Aquatic matrix** (any one sufficient in title/abstract): marine, ocean, sea, river, lake, wastewater, drinking water, seawater, sediment, underwater, floating debris in water, aqueous environment, shoal/simulated water column.

**ML/detection pipeline** (any one sufficient): machine/deep learning, CNN, YOLO, segmentation, classification model, Raman/FTIR/HSI with automated identification, boosting on spectral features, human–machine teaming for spectra.
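
The decision flow can be approximated in code, with heavy caveats: actual screening was done per record and is logged in `screening-log.csv`, whereas the sketch below is a crude keyword matcher written only to make the branch order concrete. The term lists are abbreviated, spectroscopy terms count only for the review branch, and macro-debris forwards (borderline group B) are not captured. The `screen` function and its return labels are illustrative.

```python
import re


def _pattern(terms):
    """Whole-word, case-insensitive match with an optional plural 's'."""
    body = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + body + r")s?\b", re.IGNORECASE)


MP = _pattern(("microplastic", "nanoplastic"))
AQUATIC = _pattern(("marine", "ocean", "sea", "river", "lake", "wastewater",
                    "drinking water", "seawater", "sediment", "underwater",
                    "aqueous"))
PIPELINE = _pattern(("machine learning", "deep learning", "cnn", "yolo",
                     "segmentation", "classification model", "boosting"))
SPECTRO = _pattern(("raman", "ftir", "spectroscopy", "hyperspectral"))
REVIEW = _pattern(("systematic review", "scoping review"))


def screen(title, abstract="", retracted=False):
    """Return (decision, reason) following the branch order of the flow above."""
    text = f"{title} {abstract}"
    mp, aquatic, ml, spectro, review = (
        bool(p.search(text)) for p in (MP, AQUATIC, PIPELINE, SPECTRO, REVIEW)
    )
    if retracted or not mp:
        return "excluded", "retracted or unrelated topic"
    if ml and aquatic:
        return "forward_fulltext", "MPs + automated pipeline + aquatic matrix"
    if review and (ml or spectro):
        return "forward_fulltext", "review (gap map; not primary count)"
    if ml:
        return "excluded", "non-aquatic matrix"
    if aquatic:
        return "excluded", "no automated detection pipeline"
    return "excluded", "insufficient signal in abstract"


examples = [
    "YOLO detection of microplastics in river water",
    "Deep learning for microplastics in farmland soil",
    "Vertical distribution of microplastics in the ocean",
    "Raman spectroscopy for microplastics in water sources: a systematic review",
]
for title in examples:
    print(screen(title), "<-", title)
```

Whole-word matching matters here: a plain substring test for "sea" would fire on "research", which is one reason keyword rules can only pre-sort records for a human screener.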

---

## Exclusion buckets (n=112)

Counts align with [prisma-flow.md](../../corpus/structured/prisma-flow.md); examples below are illustrative, not exhaustive.

### 1. MP + ML but non-aquatic matrix (n≈28)

Spectroscopy or CV on plastics without water/wastewater/marine context in title/abstract.

| `paper_id` | Rationale |
| --- | --- |
| W2998576579 | Atmospheric MP deposition; not aquatic sampling |
| W3037625619 | Transparent plastic granulate sorting (industrial) |
| W4290930152 | MP extraction from soils |
| W4311421434 | Farmland soil HSI MPs |
| W4310871764 | Polystyrene counting in carp **blood** (toxicology biodistribution, not water matrix) |
| W4408680321 | MP screening in **soils** (SWIR HSI) |

**Rule:** Laboratory polymer ID on dry/soil/atmospheric samples is excluded unless abstract states aquatic or wastewater sample matrix.

### 2. Aquatic MPs or environment without ML detection (n≈22)

Ecology, transport, policy, or monitoring without an automated classification/counting method.

| `paper_id` | Rationale |
| --- | --- |
| W2948539118 | Vertical distribution / biological transport; no ML |
| W2947746912 | Deep-ocean observing; not MP detection |
| W2959012558 | Integrated marine debris observing **system**; not detection ML |
| W4388571175 | MP attenuation through treatment trains; no ML model |
| W4386416320 | Marine MP pollution threats; no detection method |

### 3. Generic off-topic despite keyword harvest (n≈25)

OpenAlex queries surfaced adjacent ML or “water” papers.

| `paper_id` | Rationale |
| --- | --- |
| W4390475396 | Deep learning for phase recovery (optics), not MPs |
| W4390728072 | Cyanobacteria HAB review |
| W4365448739 | IoT pH/TDS/turbidity; no MPs |
| W4378832918 | Retracted lung-nodule CT paper |

### 4. Reviews excluded at title/abstract (n≈8)

Reviews **outside** MP detection / aquatic ML scope (fashion, e-textiles, wind farms, general ANN in unrelated domains).

| `paper_id` | Rationale |
| --- | --- |
| W4291825467 | Smart e-textiles review |
| W4307342666 | Plant toxicity review; no detection |
| W4224243654 | Offshore wind O&M review |

**Contrast:** MP-focused method reviews (Raman in water, AI imaging MPs) were **forwarded** for gap mapping (see group A under “Borderline **forward** decisions” below).

### 5. Macro-debris / waste-site remote sensing without MP pipeline (n≈3)

| `paper_id` | Rationale |
| --- | --- |
| W4317358628 | Satellite terrestrial **waste aggregations**; not MP spectroscopy/CV |
| W4387264069 | Satellite water-quality regression meta-analysis; not MPs |

**Contrast:** Marine **floating debris** or **underwater litter** with explicit ML detectors were forwarded (macro plastic as proxy for litter pathways relevant to MP sources).

---

## Borderline **forward** decisions (full-text scrutiny)

These met title/abstract inclusion heuristics; full text may refine `included` vs `excluded` at Phase 4+.

### A. Systematic reviews and gap maps (n≈30 forwards)

Tagged `is_review: true` where applicable. Used for **gap analysis only**, not primary performance counts in extraction.

| `paper_id` | Why forwarded | Full-text note |
| --- | --- | --- |
| W4296114416 | Raman MP in water sources SLR | Extract modality gaps, not primary metrics |
| W4313826580 | AI-based microplastic imaging review | Map CV/HSI trends |
| W4409887007 | ML advancements in MP/NP detection | Central gap-map anchor |
| W3006187093 | Atmospheric MPs review | Peripheral to aquatic RQ; keep for boundary documentation |

### B. Marine debris / underwater litter CV (macro objects)

YOLO/EfficientDet/Sentinel-2 on bottles, bags, trash — not always particle-level MPs.

| `paper_id` | Why forwarded | Full-text note |
| --- | --- | --- |
| W3176625739 | Deep features for marine debris classification | Confirm metrics and matrix |
| W4205835860 | MARIDA Sentinel-2 marine debris benchmark | Dataset paper; map to CV modality |
| W4321194910 | EfficientDet real-time marine debris (AUV) | Underwater CV; deployment angle |
| W4394767335 | YOLOv8 underwater trash detection | Aquatic CV; check MP vs macro labels |

**Rule:** Forward when aquatic CV/automation targets plastic litter/debris; flag `modality=computer_vision` and distinguish macro vs micro in extraction.

### C. Lab matrices still “aquatic-relevant”

Pure water, tap water, rain, or spiked seawater — forwarded when ML spectroscopy/CV targets MP identification.

| `paper_id` | Why forwarded |
| --- | --- |
| W4304690559 | SERS PS/PE in pure water |
| W4310048683 | Raman+ML six polymers in five water types |
| W4382931577 | Nanoplastics in tap water / rainwater spikes |

### D. Non-field or peripheral matrices

| `paper_id` | Why forwarded | Caveat |
| --- | --- | --- |
| W4294957642 | HSI+DL MPs from medical masks | Environmental release framing; weak field aquatic site |
| W4409162823 | Phone microscopy MPs in consumer products | Post-oxidation lab workflow; not environmental monitoring |
| W4408220111 | MP deposits on urban beaches, **São Paulo** coast | **LATAM site** — prioritize full text and `global_south` flag |

### E. Broad wastewater / waste-management reviews

| `paper_id` | Why forwarded | Caveat |
| --- | --- | --- |
| W4295532849 | AI in wastewater treatment (comprehensive review) | May lack MP-specific models; skim for detection sections |
| W4386607939 | AI in wastewater treatment review | Same |

---

## Rules deferred to full-text screening

| Topic | Title/abstract | Full-text action |
| --- | --- | --- |
| Performance metrics | Often absent in abstract | Require reported accuracy/F1/limitations or mark `unverified` |
| `global_south` | Rare in abstract | Set `yes` only with affiliation, site, or author claim (e.g. W4408220111) |
| Reviews vs primary | `is_review` + forward | Do not add to primary extraction row counts |
| Copolymers / mixtures | Keyword hit only | Confirm ML handles mixed spectra |
| Macro vs micro CV | Forward litter detectors | Extract object size class and deployment environment |

---

## Consistency checks performed (Phase 3.10–3.12)

- Zero `maybe` in `papers.jsonl` and `screening-log.csv`.
- `forward_fulltext` = 116 (manifest `forward_fulltext_target_met: true`; target ≥35).
- Exclusion reasons are protocol-aligned strings; no blank `reason` on excluded rows.

---

## Change log

| Date | Change |
| --- | --- |
| 2026-05-18 | Initial edge-case rationale (Phase 3.13) |


# Part XII — Reviews synthesis


# Systematic reviews synthesis (`is_review: true`)

**Phase:** 5.9 | **Date:** 2026-05-18  
**Corpus:** 46 reviews in `papers.jsonl` (of 228 records)  
**Use in SLR:** Gap analysis and method landscape only — **not** primary performance evidence (per `protocol.md`).

## Summary counts

| Bucket | n | Full text obtained | Per-paper summary (`corpus/summaries/`) |
| --- | ---: | ---: | ---: |
| Forwarded (`forward_fulltext`) | 22 | 9 | 9 |
| Forwarded, FT unavailable (subset of the 22 forwarded) | 13 | 0 | 0 |
| Excluded at title/abstract | 24 | 0 | 0 |

## Thematic clusters (title/abstract taxonomy)

| Cluster | n | Role for this SLR |
| --- | ---: | --- |
| ML / AI / deep learning (general or MP-focused) | 21 | Method trends, automation gaps, tool introductions |
| Raman / FTIR / hyperspectral / microspectroscopy | 5 | Spectroscopy + ML maturity; lab vs field |
| Imaging / CV / in-situ detection | 2+ | Overlap with primary CV papers; dataset scarcity |
| Risk, occurrence, corona, mitigation | 3 | Context for deployment claims, not detection metrics |
| Textile, fashion, compost, SAR, other peripheral | 15 | Boundary documentation; usually excluded or weak aquatic link |

## Full-text reviews (n=9) — integrated findings

These have `corpus/summaries/{paper_id}.md` and inform gap analysis with more confidence.

| `paper_id` | Year | Focus | SLR takeaway |
| --- | ---: | --- | --- |
| W3134265767 | 2021 | Classic vs innovative MP **identification** methods | Lab-heavy ID stack (FTIR/Raman/microscopy); few field CV pipelines |
| W4296114416 | 2022 | **Raman** for MP in water (systematic review) | Raman spectroscopy viable for water matrices; WWTP removal stats aggregated; automation still sparse |
| W4313826580 | 2023 | **AI + microplastic imaging** bibliometric (WoS 2019–2022) | Research concentrated in imaging-AI; limited LATAM-authored imaging studies |
| W4318615471 | 2023 | **HSI** plastic waste detection | Good for sorting/recycling; sub-mm MP weak; **Malaysia** authorship (Global South) |
| W4210266455 | 2022 | MP/NP occurrence, ID, risk, mitigation | Broad pollution review; **India (Hyderabad)** context for lab capacity |
| W4290026722 | 2022 | **Coronas** on M/NPs | Risk assessment lens; corona alters toxicity — detection ≠ hazard |
| W4393943493 | 2024 | **Aged** microplastics | Weathering breaks library-match assumptions; ML needs aged training data |
| W4382940669 | 2023 | Textile washing MP reduction + **machine vision** prospects | Source reduction; MV prospective, not aquatic monitoring |
| W2982912960 | 2019 | Structural metals sustainability | **Peripheral** — retained only for screening overlap; no MP detection content |

### Cross-cutting themes (from full-text set)

1. **Spectroscopy dominates automated ID reviews** (Raman, μFTIR, HSI) while **aquatic CV** reviews are fewer and often macro-litter adjacent.
2. **Global South presence** in reviews is thin: explicit mentions include **India** (W4210266455) and **Malaysia** (W4318615471); none of the obtained review full texts describes a Colombia/LATAM field program.
3. **Aging and matrix effects** (W4393943493, W4290026722) imply deployment models must report environmental conditioning, not pristine-lab accuracy alone.
4. **Reviews confirm the automation gap**: human-annotation and library-search bottlenecks are cited repeatedly, which aligns with the primary papers on random-decision-forest (RDF) and CNN pipelines.

## Forwarded reviews without full text (n=13) — abstract-level map

Abstract-only synthesis; metrics **not** extracted to `extraction.csv`.

| `paper_id` | Year | Title (short) | Expected gap-map contribution |
| --- | ---: | --- | --- |
| W4409887007 | 2025 | ML advancements in MP/NP detection | **Anchor** SLR for method taxonomy; obtain FT when OA available |
| W4396828529 | 2024 | In-situ aquatic MP detection (systematic) | Field vs lab deployment boundary |
| W4404459247 | 2024 | Introduction to ML tools for microplastic analysis | Methods primer for practitioners |
| W4400366361 | 2024 | AI microfluidic platform (contaminant detection) | Lab-on-chip + ML intersection |
| W4387344131 | 2023 | IR/NIR microspectroscopy for MPs | Spectral ID automation |
| W4386607939 | 2023 | AI in wastewater treatment | Utility-scale monitoring context |
| W4295532849 | 2022 | Conventional + AI in water treatment | WWTP + MP overlap |
| W4292417018 | 2022 | ML meta-analysis of MP polymer composition | Polymer-class landscape |
| W4292363386 | 2022 | ML advances for spectroscopic MP analysis | Spectroscopy automation |
| W3036200362 | 2020 | HSI online detection of MPs | HSI + real-time framing |
| W3006187093 | 2020 | Atmospheric MPs review | Peripheral to aquatic RQ |
| W3188289237 | 2021 | Norwegian MP research perspective | Regional policy/science |
| W3110966552 | 2020 | Recycled textiles / circular fashion | Upstream fibers; not aquatic CV |

## Excluded reviews (n=24) — why not in gap-map deep dive

These reviews failed the aquatic ML-detection inclusion criteria at title/abstract screening (topics: food, agricultural HSI, smart-city waste, generic water quality, toxicity-only, offshore wind, etc.). They are listed in `screening-log.csv` with protocol-aligned reasons and retained in `papers.jsonl` with `is_review: true` for PRISMA transparency.

Notable excluded IDs: W4401332314 (ML plastic waste detection), W4386170962 (emerging technologies review), W4400653066 (AI marine pollution diversity), W4324091125 (polarimetric imaging DL review).

## Implications for primary evidence (Phase 6+)

| Rule | Detail |
| --- | --- |
| Do not count review papers in `extraction.csv` performance aggregates | Use reviews only to justify gaps |
| Prefer claims from `is_review: false` + `forward_fulltext` | See `claims.jsonl` batches |
| When `slr.md` cites a review | Label as secondary synthesis; cite `paper_id` |
| Overlap | Several reviews duplicate topics covered by primary papers (e.g. μFTIR+ML W4200249418 vs Raman review W4296114416) — reviews set context, primaries supply metrics |
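
The first two rules can be applied mechanically before any performance aggregate is computed. A minimal sketch, assuming each record carries both an `is_review` flag (as in `papers.jsonl`) and a `screening_status` value (as in the extraction table). The merged record shape, the function name `primary_rows`, and the status assigned to the review record are illustrative assumptions, not the project's actual scripts:

```python
from typing import Iterable


def primary_rows(records: Iterable[dict]) -> list[dict]:
    """Keep records explicitly marked as non-review and forwarded to full text."""
    return [
        r for r in records
        if r.get("is_review") is False
        and r.get("screening_status") == "forward_fulltext"
    ]


# Paper IDs come from this report; the combined record layout is assumed.
records = [
    {"paper_id": "W4200249418", "is_review": False, "screening_status": "forward_fulltext"},
    {"paper_id": "W4296114416", "is_review": True, "screening_status": "forward_fulltext"},
]
kept = [r["paper_id"] for r in primary_rows(records)]
print(kept)
```

Records with a missing `is_review` flag are dropped rather than assumed to be primaries, which errs on the side of keeping review-derived numbers out of the aggregates.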

## LATAM / resource-limited deployment (review lens)

| Finding | Evidence |
| --- | --- |
| No obtained full-text review names a LATAM study site | None in the obtained full-text set; the forwarded set is abstract-only |
| Global South authorship appears in risk/HSI reviews | W4210266455 (India), W4318615471 (Malaysia) |
| Reviews emphasize **capital equipment** (Raman, HSI, μFTIR) | W4296114416, W4318615471, W4387344131 (abstract) |
| **In-situ** aquatic detection reviews rising (2024–2025) | W4396828529, W4409887007 — align with deployment gap RQ |
| **Source reduction** (textiles) vs **monitoring** split | W4382940669 vs aquatic CV primaries |

## Recommended follow-ups (not blocking 5.9)

- Obtain OA full text for **W4409887007** when available (central ML-MP review).
- Link each primary `extraction.csv` row to a contrasting review claim where applicable (Phase 7).

## Source files

- Inventory: `corpus/structured/papers.jsonl` (`is_review: true`)
- Per-paper summaries (obtained FT only): `corpus/summaries/*.md`
- Screening context: `outputs/knowledge/screening-rationale.md` §A


# Part XIII — Build-in-public thread (reference)


# Build in public — X thread outline (microplastics ML/CV SLR)

**Phase:** 10.11 | **Date:** 2026-05-18  
**Source map:** [slr.md](./slr.md) · [evidence explorer](../products/evidence-explorer/index.html) · [latam-gap-analysis.md](./latam-gap-analysis.md)

**Tone:** Honest systematic map—not hype. Every stat below is traceable to `paper_id` in the corpus. Adjust numbers if you regenerate after search lock.

**Suggested media:** PRISMA screenshot from `prisma-summary.json`, modality chip screenshot from evidence explorer, one table crop from `slr.md` §3.

---

## Thread metadata

| Field | Value |
| --- | --- |
| Suggested title | What ML papers actually show about microplastic detection in water (2019–2025) |
| Hashtags (pick 2–3) | `#OpenScience` `#EnvironmentalML` `#Microplastics` |
| Link in bio / last post | Repo path or published `slr.md` when public |

---

## Post 1 — Hook

**Text (~260 chars):**

> I mapped 228 ML/CV papers on microplastic detection in water (2019–2025). Headline: lab spectroscopy is strong; “AI river sensors” and Colombia field validation are thin. Thread = what procurement teams should *not* assume 🧵

**Alt shorter:**

> Built a systematic map (not a meta-analysis): 228 papers → 31 open full texts → 37 structured extractions. The gap between lab accuracy and field MP monitoring is huge. 🧵

---

## Post 2 — Methods transparency

**Text:**

> How: OpenAlex harvest, 7 queries, search locked 2026-05-18. Screened 228 title/abstract → 116 forward → 31 OA full texts obtained (85 paywalled/OA-failed). This is a **map**, not pooled effect sizes. PRISMA counts in repo.

**Link:** `outputs/products/prisma-summary.json`
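
The counts in this post can be sanity-checked before publishing. The sketch below is a hedged illustration: the dictionary keys are hypothetical and are not taken from `prisma-summary.json`, whose actual schema is not shown in this report.

```python
def check_prisma_counts(counts: dict) -> list[str]:
    """Return human-readable problems; an empty list means the counts add up."""
    problems = []
    if counts["forwarded"] > counts["screened"]:
        problems.append("more records forwarded than screened")
    accounted = counts["fulltext_obtained"] + counts["fulltext_not_retrieved"]
    if accounted != counts["forwarded"]:
        problems.append(
            f"full-text stage accounts for {accounted}, expected {counts['forwarded']}"
        )
    return problems


# Counts quoted in Post 2; the key names are assumptions for illustration.
post2_counts = {
    "screened": 228,
    "forwarded": 116,
    "fulltext_obtained": 31,
    "fulltext_not_retrieved": 85,
}
problems = check_prisma_counts(post2_counts)
print(problems or "counts are consistent")
```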

---

## Post 3 — Metric trap (precision vs mAP)

**Text:**

> Don’t compare papers on “accuracy” alone. Thailand UV + Faster R-CNN: **precision 85–88%** but **mAP ~34–36%** on field MP boxes (W4391755619). High precision can hide missed small particles. Report **both** detection rate and confirmation.

**Cite:** W4391755619
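
To see how high precision and low average precision can coexist, consider a toy single-class example. The numbers below are invented for illustration and are not taken from W4391755619. The sketch computes AP at one fixed matching threshold with all-point interpolation; mAP as reported in papers averages AP over classes (and, in some benchmarks, over IoU thresholds).

```python
def precision_recall_ap(hits: list[bool], n_ground_truth: int) -> tuple[float, float, float]:
    """hits: detections sorted by descending confidence, True if matched to a real particle."""
    if not hits or n_ground_truth == 0:
        return 0.0, 0.0, 0.0
    tp = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / n_ground_truth)
    # All-point interpolation: make the precision envelope non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return tp / len(hits), tp / n_ground_truth, ap


# Toy case: 20 real particles, 8 detections, 7 of them correct.
hits = [True, True, True, True, False, True, True, True]
precision, recall, ap = precision_recall_ap(hits, n_ground_truth=20)
print(f"precision={precision:.3f} recall={recall:.3f} AP={ap:.3f}")
# precision=0.875 recall=0.350 AP=0.331
```

The detector is right 87.5% of the time when it fires, yet it finds only 35% of the particles, so AP cannot exceed 0.35. A precision figure quoted on its own says nothing about the 13 particles that were never detected.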

---

## Post 4 — Wrong task (litter ≠ MPs)

**Text:**

> Many “vision” hits are **macro litter**, not polymer-level microplastics. India river YOLO: **89% mAP** on floating solid waste (W4291123479). Sentinel-2 debris: up to **98%** scenario accuracy (W4380082849). Fine for litter KPIs—**not** MP compliance without relabelling.

**Cite:** W4291123479, W4380082849

---

## Post 5 — Lab spectroscopy wins (with capex)

**Text:**

> Where evidence is strongest: **μFTIR + ML** accuracy **0.9766**, κ **0.9690** on environmental matrices incl. sludge (W4200249418). µ-Raman: precision **≥97%**, recall **≥99%** on 5 polymers (W4366815281). Tradeoff: instruments + expert workflow—not a $10 river sensor.

**Cite:** W4200249418, W4366815281

---

## Post 6 — “Low-cost edge” reality check

**Text:**

> Four extraction rows flagged “edge/low-cost.” Only **one** has open full text we could verify: polarization holographic flow imaging, up to **96%** accuracy (W4391319604)—specialized optics, still lab-centric. Phone YOLO & AI-camera papers? Abstract-only or consumer matrix (W4409162823, W4400418758).

**Cite:** W4391319604, W4409162823, W4400418758

---

## Post 7 — Global South field vs LATAM

**Text:**

> GS **field** vision with trained models: India river YOLO (W4291123479), Thailand UV (W4391755619), Cambodia drone litter (W3091414454), China underwater litter (W3204790372). Useful for **method** transfer—not proof for Colombia/LATAM error rates. `global_south=yes` ≠ field site.

**Cite:** W4291123479, W4391755619, W3091414454, W3204790372

---

## Post 8 — Colombia / LATAM gap (honest)

**Text:**

> LATAM: 25 flagged papers in corpus; **zero** obtained primaries with a **Colombia field programme** for aquatic ML/CV MPs. CO affiliation on a lab Raman study only (W4392657594). Strongest LATAM **field** signal: Brazil beach RS+ML—still **abstract-only** here (W4408220111). Local pilot mandatory.

**Cite:** W4392657594, W4408220111

---

## Post 9 — Tiered monitoring (proposal, not product)

**Text:**

> Defensible architecture for resource-limited coasts/rivers (synthesis, not one paper): (1) macro litter surveillance separate KPI (2) field screen alert (3) spectroscopy hub on subsets. We wrote transferable + **non-transferable** lists for Colombia—no Magdalena-validated product exists yet.

**Link:** `colombia-transferable-methods.md`, `colombia-non-transferable-methods.md`

---

## Post 10 — What we shipped (build in public)

**Text:**

> Artifacts from the factory run: full `slr.md`, 71 claims (`claims.jsonl`), static **evidence explorer** (modality + matrix filters, links to 29 summaries), BibTeX for 228 papers. Built with Ralph loop + notebook repo—scripts regenerate from `papers.jsonl`.

**Link:** `outputs/products/evidence-explorer/index.html`, `outputs/products/README.md`

---

## Post 11 — Limitations (credibility)

**Text:**

> Limits: English/OpenAlex bias; **85** forwards without OA full text; metrics not pooled; reviews cited for gaps not performance. Paywalled LATAM venues likely under-counted. Open questions: Colombia pilot design (Q9), tiered stack validation (Q10).

**Link:** `questions.md`

---

## Post 12 — CTA / close

**Text:**

> If you’re designing MP monitoring in LATAM or a low-resource lab: separate **litter CV** from **polymer ID**, demand **mAP + spectral confirmation**, and budget a **local pilot** before procurement. Questions welcome—I’ll link the map PDF/HTML when published.

**Optional poll:** “Biggest blocker for your context?” → Cost / Field validation / Metrics confusion / Staff training

---

## Optional follow-up posts (not in main thread)

| Topic | One-liner |
| --- | --- |
| MARIDA ≠ MP | Marine debris IoU benchmark (W4205835860) is not an MP detector for regulation. |
| Microfluidic n=5 | W4383534319: >93% lab, **5** field particles—don’t scale to ops from that alone. |
| Aged MPs | Review W4393943493: weathering breaks library-match—tropical relevance for training data. |

---

## Pre-publish checklist

- [ ] Re-run `validate-corpus.sh`; confirm `slr.md` has no `TBD`
- [ ] Regenerate evidence explorer if corpus changed
- [ ] Replace “228/31/37” if manifest stats drift
- [ ] Add public URL when repo or report is published
- [ ] Do not quote unverified abstract metrics as validated (8 abstract-only extraction rows)
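
Two of the checklist items (no leftover `TBD`, headline counts still present) are easy to script. The sketch below is a hypothetical helper, not a reproduction of `validate-corpus.sh`, whose contents are not shown here. The function names and the sample draft text are assumptions.

```python
import re


def lines_with_tbd(markdown_text: str) -> list[int]:
    """Return 1-based line numbers that still contain the token TBD."""
    return [
        number
        for number, line in enumerate(markdown_text.splitlines(), start=1)
        if re.search(r"\bTBD\b", line)
    ]


def missing_counts(thread_text: str, expected: list[int]) -> list[int]:
    """Return expected headline counts that never appear as whole numbers."""
    return [n for n in expected if not re.search(rf"\b{n}\b", thread_text)]


# Invented draft text, used only to exercise the two checks.
draft = "Mapped 228 papers, 31 open full texts.\nField validation: TBD\n"
tbd_lines = lines_with_tbd(draft)
absent = missing_counts(draft, [228, 31, 37])
print(tbd_lines)  # [2]
print(absent)     # [37]
```

A missing count does not prove the thread is wrong; it only flags a number that should be re-read against the manifest.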


# Part XIV — All evidence claims


| # | Claim | Confidence | Evidence |
| ---: | --- | --- | --- |
| 1 | SMACC achieves 1.4% count error and under 4% classification error on 1–5 mm beach sediment microplastics using Sauvola segmentation plus CNN features. | high | W3003736709 |
| 2 | MP-Net U-Net variant reaches mean F1 0.736 and mean IoU 0.617 on fluorescence microscopy patches of stained microplastics from clam digestate. | high | W4282979647 |
| 3 | Polarization holographic flow-through imaging reports classification accuracy up to 96% for microplastic assessment with Bland-Altman bias SD 0.05935 versus reference counts. | high | W4391319604 |
| 4 | Label-free microfluidic CNN/ResNet models exceed 93% accuracy and AUC 0.98±0.02 for small seawater microplastics after augmentation, with field demo on five trapped particles. | medium | W4383534319 |
| 5 | Subspace KNN ensemble on weathered SloPP-E Raman spectra reaches 93.81% polymer classification accuracy versus 89% baseline. | high | W4362015000 |
| 6 | µ-Raman deep-learning workflow reports ≥97.1% precision and ≥99.4% recall for five polymers (PE, PP, PS, PVC, PET) across 64,000 spectra from 47 German environmental samples. | medium | W4366815281 |
| 7 | Random-forest Raman identification of 24 nanoplastic types reports 98.8% accuracy, 98.5% sensitivity, and 100% specificity in laboratory validation. | high | W4382931577 |
| 8 | μFTIR imaging with random decision forest achieves 0.9766 accuracy and 0.9690 Cohen kappa (Monte Carlo CV) across >20 polymer classes on environmental matrices including sewage sludge. | high | W4200249418 |
| 9 | Sentinel-2 XGBoost floating marine debris detection reaches 98% accuracy in scenario 1 but only 83% on plastic-pixel scenario 2 at 10 m resolution. | medium | W4380082849 |
| 10 | MARIDA benchmark U-Net baseline averages IoU 0.57 for marine debris segmentation on Sentinel-2 with wide per-class IoU spread (0.02–1.0). | medium | W4205835860 |
| 11 | Custom YOLO riverine solid-waste detector reports mAP 89%, F1 0.8, and recall 86% on India urban river imagery. | high | W4291123479 |
| 12 | Faster R-CNN-FPN on UV-excited field microplastic imagery reports 85.5–87.8% precision but only 33.9–35.7% mAP on internal/external tests in Thailand. | high | W4391755619 |
| 13 | EfficientDet BiFPN variants improve marine debris AP by 1.2–2.6 percentage points across scales versus baseline on underwater plastic datasets. | high | W4321194910 |
| 14 | Embeddable Mask R-CNN underwater garbage detector reports 9.6% mAP gain and 5.0% segmentation gain over baseline on marine macro litter (China). | medium | W3204790372 |
| 15 | APLASTIC-Q drone CNN litter mapping in Cambodia reaches up to 83% accuracy for aquatic plastic litter classes on orthomosaics. | medium | W3091414454 |
| 16 | Deep UV Raman plus CNN prototype reports ~97% accuracy for multi-analyte water monitoring including a microplastics pathway, but MP coverage remains prototype-stage. | medium | W3172017684 |
| 17 | Hyperspectral SWIR imaging with hierarchical PLS-DA mapped Po River freshwater microplastics at 1.89–8.22 particles/m³ across four stations. | medium | W4213300830 |
| 18 | No obtained-full-text primary study in this corpus reports a Colombia field site for ML/CV microplastic detection. | medium | W4392657594 |
| 19 | Systematic review of Raman for water-body microplastics cites WWTP MP removal efficiencies from 1.8% to 54.5% depending on treatment level. | low | W4296114416 |
| 20 | Bibliometric review of AI microplastic imaging (WoS 2019–2022) finds imaging-AI research concentrated outside LATAM-authored datasets. | medium | W4313826580 |
| 21 | Aged-microplastics review argues weathered particles break pristine-library assumptions, requiring aged training data for spectroscopy+ML ID. | medium | W4393943493 |
| 22 | Satellite and AUV object-detection papers in this corpus target macro litter or debris, not polymer-level microplastic quantification in water. | medium | W4205835860 |
| 23 | Among extraction-table Global South rows, only W4291123479 (India), W4391755619 (Thailand), W3091414454 (Cambodia), and W3204790372 (China) combine field geography with trained vision models. | medium | W3091414454 |
| 24 | Textile-washing microfiber review positions machine vision as prospective for monitoring but reports no unified aquatic CV benchmark for microplastic release. | medium | W4382940669 |
| 25 | Comprehensive micro/nanoplastic occurrence review with India (Hyderabad) authorship aggregates ID methods across air, soil, water, food, and wastewater without a single deployable aquatic ML detector. | medium | W4210266455 |
| 26 | Marine environmental pollution AI narrative review cites one system distinguishing plastic versus natural materials at 86% but lacks a unified microplastic detection benchmark. | low | W3122508379 |
| 27 | Hyperspectral plastic-waste review (Malaysia) reports cited ResNet accuracies near 88.6% for waste sorting but notes sub-millimetre microplastic precision falls below 80% in cited studies. | low | W4318615471 |
| 28 | Image-based particle metrology primer recommends maximum Feret diameter and shape descriptors for harmonized MP sizing but does not train an end-to-end aquatic ML detector. | high | W4385411640 |
| 29 | Classic versus innovative MP identification review (2021) finds spectroscopy and microscopy dominate automated ID while field aquatic computer-vision pipelines remain sparse. | medium | W3134265767 |
| 30 | Gold-nanoparticle SERS bench study detects 350 nm polystyrene in pure water at 0.6 M aggregation-agent LOD but uses no trained ML classifier. | medium | W4304690559 |
| 31 | Micro/nanoplastic corona review treats surface coronas as key toxicity determinants and provides no field detection ML performance benchmark. | medium | W4290026722 |
| 32 | PRISMA pansharpened Sentinel-2 plastic-index pipeline targets detection down to ~8% pixel plastic coverage with ~86% accuracy in controlled marine experiments. | medium | W3155690422 |
| 33 | Real-time AI-camera study (YOLOv5 + DeepSORT) targets microplastic motion in a laboratory flume; field deployment and full metrics remain unverified pending full text. | low | W4400418758 |
| 34 | Phone-microscope YOLOv5 system cites a ~$10 TinyScope attachment for on-site consumer-product microplastic detection, not aquatic field monitoring. | low | W4409162823 |
| 35 | Underwater YOLACT instance segmentation on TrashCAN-trained litter targets real-time macro plastic sources, not polymer-level microplastics in water. | low | W4385454320 |
| 36 | Preliminary in-situ ocean Raman plus machine-learning prototype reports early plastic classification results but is not yet field-deployed. | low | W4404688861 |
| 37 | São Paulo urban sandy-beach study integrates remote sensing, GNSS, Raman, and Random Forest/Gradient Boosting to predict microplastic deposits of 6–35 per m² (abstract-level). | low | W4408220111 |
| 38 | Mexico laboratory FTIR study compares k-NN, SVM, RF, CNN, and MLP for six-polymer microplastic classification with normalization-method sensitivity (metrics unverified in corpus). | low | W4408550134 |
| 39 | Brazil UFSC-authored ocean microplastics vibrational-spectroscopy ML paper is forwarded for full text but lacks harvest abstract and verified metrics in this corpus. | low | W3196128465 |
| 40 | Colombia-affiliated Raman high-frequency-noise study examines ML polymer classification tradeoffs in laboratory MP ID without an aquatic field site in available text. | low | W4392657594 |
| 41 | Eight extraction rows are abstract-only forwards with edge or LATAM flags but no obtained PDF, limiting high-confidence deployment claims for those papers. | medium | W4400418758 |
| 42 | O-PTIR with simultaneous Raman on nine reference polymers shows inter-system spectral reproducibility tradeoffs versus conventional FTIR and Raman microspectroscopy for microplastic ID. | medium | W4308496878 |
| 43 | Source comparison in O-PTIR study cites prior work where automated visual inspection identified only 1.4% of suspected particles correctly as synthetic polymers versus manual Raman confirmation. | medium | W4308496878 |
| 44 | Fish-intestine hyperspectral imaging with support-vector-machine classification reports ~6-minute total workflow (1 min acquisition, 5 min analysis) versus digestion-heavy Raman/FTIR protocols in abst… | low | W2936115560 |
| 45 | Expert survey proposes nine harmonized shape classes for hyperspectral-imaged microplastics validated on 11,042 particles across indoor air, wastewater, marine water, stormwater, and sediment matrices… | medium | W4385737119 |
| 46 | Deep-learning shape-classification study tests nine architectures (VGG16, ResNet50, MobileNet, custom CNNs) on hyperspectral images of 11,042 environmental microplastics to replace expert labeling. | low | W4414305742 |
| 47 | 2025 systematic review on machine-learning advancements for microplastic and nanoplastic detection emphasizes ML plus spectroscopy to address low resolution, large data volumes, and long imaging times… | medium | W4409887007 |
| 48 | 2024 in-situ aquatic microplastic detection systematic review cites lack of standardisation, limited spatiotemporal coverage, high cost, and slow lab procedures as drivers for field monitoring technol… | medium | W4396828529 |
| 49 | 2024 practitioner primer on machine-learning tools for microplastics in soil, river water, and biosolids argues ML can reduce extraction labor and increase analysis speed versus manual counting. | medium | W4404459247 |
| 50 | Infrared and near-infrared microspectroscopy review describes coupling multivariate and machine-learning algorithms to quantify microplastic exposure in drinking water, dust, food, and air matrices. | medium | W4387344131 |
| 51 | Machine-learning meta-analysis of global marine microplastic polymer composition is authored from Cochin University of Science and Technology (India) but full text was not obtained in this corpus. | low | W4292417018 |
| 52 | μFTIR random-decision-forest classifier reports PP sensitivity 0.957 and PVC sensitivity 1.000 alongside overall accuracy 0.9766 on multi-matrix environmental samples. | high | W4200249418 |
| 53 | APLASTIC-Q PLQ-CNN quantification accuracy is 60–71% versus up to 83% for PLD-CNN litter detection on Cambodia drone orthomosaics. | medium | W3091414454 |
| 54 | MP-Net U-Net1 variant achieves mean recall 0.883 on fluorescence microscopy patches, highest recall among compared segmentation architectures. | high | W4282979647 |
| 55 | German µ-Raman environmental workflow found polymers in 10.7% of 47 catchment samples across 64,000 spectra with five-class polymer coverage only. | medium | W4366815281 |
| 56 | Among 34 modality groups in extraction-by-modality.csv, object-detection RGB variants (five groups) outnumber dedicated in-situ aquatic microplastic CV pipelines. | medium | W4321194910 |
| 57 | Only one of four extraction rows flagged edge_low_cost=yes has obtained open-access full text in this corpus (polarization holographic W4391319604). | medium | W4391319604 |
| 58 | PRISMA full-text stage included 31 papers and excluded 85 forwards where OA PDF or HTML full text could not be retrieved under project access rules. | high | W4200249418 |
| 59 | Peripheral structural-metals sustainability review (2019) was included at full text for screening transparency but provides no microplastic or aquatic computer-vision detection evidence. | high | W2982912960 |
| 60 | High-throughput Mediterranean FTIR-plus-ML microplastic identification study (179 citations) is forwarded but remains paywalled with null abstract in OpenAlex harvest. | low | W2952839204 |
| 61 | Twelve extraction rows mark global_south=yes, yet obtained-full-text primaries with explicit Global South field sites remain limited to India, Thailand, Cambodia, and China vision papers. | medium | W4291123479 |
| 62 | Phase 9.1 LATAM scan identified 25 papers with LATAM-specific signals among 228 harvested works (~11% of corpus), stored in latam-papers.jsonl. | high | W4408220111 |
| 63 | Only one LATAM-subset paper (São Paulo urban sandy beaches) is classified as a field site for MP-related remote sensing plus ML, and its full text was not obtained in this corpus. | medium | W4408220111 |
| 64 | Colombia appears in the LATAM subset via author affiliation on a laboratory Raman ML study without an obtained full text or documented Colombian aquatic field sampling site. | medium | W4392657594 |
| 65 | latam-papers.jsonl excludes South and Southeast Asian field-vision primaries that carry only global_south extraction flags and no LATAM geography signal. | high | W4291123479 |
| 66 | Four extraction rows combine global_south=yes with LATAM priority or country metadata (W4392657594, W4408220111, W4408550134, W3196128465); overlap does not imply Colombian field validation. | medium | W4408550134 |
| 67 | Among global_south=yes extraction rows, Cambodia drone litter mapping (APLASTIC-Q) has OECD-only OpenAlex author countries while field geography is Southeast Asia—affiliation tags mislead LATAM procur… | medium | W3091414454 |
| 68 | No obtained full text in the LATAM subset documents a Mexican river, coastal, or WWTP microplastic computer-vision field campaign in this map. | medium | W4408550134 |
| 69 | Sentinel-2 floating-debris scenario accuracy up to 98% addresses macro debris pixels, not polymer-level microplastic limits suitable for Magdalena or Caribbean MP regulation without KPI relabelling. | medium | W4380082849 |
| 70 | Brazil leads the LATAM subset with eight OpenAlex country-tagged papers but only one names a Brazilian field site for MP-related ML, and that study lacks verified open full-text metrics. | medium | W3196128465 |
| 71 | Magdalena River and Colombian Caribbean coastal matrices have no obtained-primary aquatic ML/CV microplastic field programme in this corpus; monitoring design remains transfer-based (Phase 9.6–9.7). | medium | W4392657594 |

**Total claims:** 71
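
The total, and the split by confidence level, can be re-derived from the table itself. A minimal sketch, assuming the four-column layout used above and that no claim text contains a literal `|`. The function name is illustrative and the sample rows are shortened versions of claims 1, 4, and 19.

```python
from collections import Counter


def tally_confidence(table_lines: list[str]) -> Counter:
    """Count claims per confidence level from Markdown table rows."""
    counts = Counter()
    for line in table_lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4 or not cells[0].isdigit():
            continue  # skip header, separator, and malformed rows
        counts[cells[2]] += 1
    return counts


rows = [
    "| # | Claim | Confidence | Evidence |",
    "| ---: | --- | --- | --- |",
    "| 1 | SMACC count error (shortened) | high | W3003736709 |",
    "| 4 | Label-free microfluidic CNN/ResNet (shortened) | medium | W4383534319 |",
    "| 19 | Raman review WWTP removal efficiencies (shortened) | low | W4296114416 |",
]
tally = tally_confidence(rows)
print(sum(tally.values()), dict(tally))
```

Run against the full table, `sum(tally.values())` should match the stated total of 71 if every row parses cleanly.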


# Part XV — Structured extraction table


| paper_id | openalex_id | doi | title | year | modality | model_type | matrix | scale | metrics | dataset_size | open_data | edge_low_cost | global_south | limitation_author | screening_status | exclude_reason | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| W3003736709 | https://openalex.org/W3003736709 | 10.1109/access.2020.2970498 | SMACC: A System for Microplastics Automatic Counting and Classification | 2020 | microscopy_lab_imaging | Sauvola_segmentation; classical_features; CNN_variant | beach_sediment_lab | 1-5 mm particles | "count_error_1.4%; classification_error_<4% (except line class)" | 12 samples; 2507 particles | unverified | no | no | "not in-situ aquatic; line-shape class weakness; requires cleaned lab samples" | forward_fulltext |  | Phase 6.1 microscopy+CV batch |
| W4282979647 | https://openalex.org/W4282979647 | 10.1371/journal.pone.0269449 | MP-Net: Deep learning-based segmentation for fluorescence microscopy images of microplastics isolated from clams | 2022 | fluorescence_microscopy | U-Net; FCN; DeepLabv3; Nested_U-Net | clam_digestate_lab | fluorescent MP in microscopy (stained) | mF1_0.736; mIoU_0.617 (U-Net4 best); mRecall_0.883 (U-Net1) | 99 images; >=20000 patches | unverified | no | no | "lab staining workflow; bivalve matrix; not in-situ water imaging" | forward_fulltext |  | Phase 6.1 microscopy+CV batch |
| W4391319604 | https://openalex.org/W4391319604 | 10.1038/s41598-024-52762-5 | High-throughput microplastic assessment using polarization holographic imaging | 2024 | polarization_holographic_microscopy | flow-through imaging + classification pipeline | flow_microfluidic_lab | 20 um spatial resolution (calibrated) | classification_accuracy_up_to_96%; Bland-Altman_bias_SD_0.05935 | unverified | unverified | yes | no | "portable/cost-effective claimed; specialized optical hardware" | forward_fulltext |  | Phase 6.1 microscopy+CV batch; edge_low_cost updated 6.9 |
| W4383534319 | https://openalex.org/W4383534319 | 10.1038/s41598-023-37900-9 | A microfluidic approach for label-free identification of small-sized microplastics in seawater | 2023 | microfluidic_optical_imaging | CNN; ResNet34; SVM; RF | seawater_surface; lab validation | small MPs (approx 10-45 um per source methods) | accuracy_>93% (CNN/ResNet augmented); AUC_0.98±0.02 | field seawater demo n=5 trapped particles; training set unverified | unverified | no | no | "microfluidic fab skill; lab classification step after field collection" | forward_fulltext |  | Phase 6.1 microscopy+CV batch |
| W4362015000 | https://openalex.org/W4362015000 | 10.54364/aaiml.2023.1144 | Machine Learning of polymer types from the spectral signature of Raman spectroscopy microplastics data | 2023 | raman_spectroscopy | subspace_KNN_ensemble; SVM; RF; ANN | weathered_MP_lab | SloPP-E weathered particles (22 polymers) | accuracy_93.81% (SloPP-E test; up from 89% baseline) | 97 SloPP-E samples; training spectra unverified | unverified | no | no | "small weathered test set; lab Raman only" | forward_fulltext |  | Phase 6.2 Raman+ML batch |
| W4366815281 | https://openalex.org/W4366815281 | 10.1186/s43591-023-00057-3 | Development of a machine learning-based method for the analysis of microplastics in environmental samples using µ-Raman | 2023 | raman_microspectroscopy | deep_learning_ReLU (single_model; per_class_models) | environmental_wastewater | Germany catchment samples | precision_>=97.1%; recall_>=99.4% (5 polymers PE PP PS PVC PET) | 64000 spectra; 47 samples; 10.7% polymer | unverified | unverified | no | no | "five polymer classes only; high instrument cost" | forward_fulltext |  | Phase 6.2 Raman+ML batch |
| W4382931577 | https://openalex.org/W4382931577 | 10.1021/acs.est.3c03210 | Automatic Identification of Individual Nanoplastics by Raman Spectroscopy Based on Machine Learning | 2023 | raman_spectroscopy | random_forest | spiked_tap_water; environmental_lab | nanoplastics (24 types) | accuracy_98.8%; sensitivity_98.5%; specificity_100%; spiked_water_>97% | 24 polymer types; environmental validation unverified | unverified | no | no | "nanoplastic lab workflow; substrate prep" | forward_fulltext |  | Phase 6.2 Raman+ML batch |
| W3172017684 | https://openalex.org/W3172017684 | 10.3390/s21113911 | Application of Laser-Induced Deep UV Raman Spectroscopy and Artificial Intelligence in Real-Time Environmental Monitoring | 2021 | raman_spectroscopy_in_situ | CNN | water_monitoring_prototype | aquatic pollutants (MP pathway cited) | CNN_accuracy_~97% (multi-analyte model) | unverified | unverified | unverified | no | "prototype; MP coverage unverified; deep UV Raman capex" | forward_fulltext |  | Phase 6.2 Raman+ML batch |
| W4213300830 | https://openalex.org/W4213300830 | 10.1007/s11356-022-18501-x | Classification and distribution of freshwater microplastics along the Italian Po river by hyperspectral imaging | 2022 | hyperspectral_imaging_SWIR | hierarchical_PLS-DA (HI-PLS-DA) | freshwater_river | Po River Italy; SWIR 1000-2500nm | unverified (HI-PLS-DA per-level metrics in source tables) | 4 river stations; 1.89-8.22 particles/m3 | unverified | no | no | "HSI capital cost; European river only" | forward_fulltext |  | Phase 6.3 hyperspectral batch |
| W4380082849 | https://openalex.org/W4380082849 | 10.1109/tgrs.2023.3283607 | Automatic Detection and Identification of Floating Marine Debris Using Multispectral Satellite Imagery | 2023 | multispectral_satellite_S2 | XGBoost | marine_coastal | floating debris (macro) | accuracy_98% (scenario1); accuracy_83% (scenario2 plastic pixels); accuracy_75% (ensemble quantification) | training sites Greece and published corpora | unverified | no | no | "10m S2 resolution; not microplastic size class" | forward_fulltext |  | Phase 6.3 hyperspectral batch |
| W4205835860 | https://openalex.org/W4205835860 | 10.1371/journal.pone.0262247 | MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data | 2022 | multispectral_satellite_S2 | U-Net; RF; SS+SI baselines | marine_coastal | marine debris segmentation | avg_IoU_0.57 (U-Net baseline per Table 4); per-class IoU 0.02-1.0 | MARIDA annotated S2 scenes | unverified | no | no | "macro debris labels; cloud gaps" | forward_fulltext |  | Phase 6.3 hyperspectral batch |
| W3155690422 | https://openalex.org/W3155690422 | 10.1109/access.2021.3073903 | Pansharpening PRISMA Data for Marine Plastic Litter Detection Using Plastic Indexes | 2021 | hyperspectral_PRISMA_pansharpened_S2 | plastic_indexes; pansharpening pipeline | marine_coastal_lab_simulation | targets down to ~8% pixel coverage | unverified (detection_8%_pixel_coverage; accuracy_86%_cited_configuration) | controlled plastic target experiments | unverified | no | no | "satellite preprocessing heavy; not field portable" | forward_fulltext |  | Phase 6.3 hyperspectral batch |
| W4200249418 | https://openalex.org/W4200249418 | 10.1021/acs.estlett.1c00851 | Computer-Assisted Analysis of Microplastics in Environmental Samples Based on μFTIR Imaging in Combination with Machine Learning | 2021 | ftir_microspectroscopy_imaging | random_decision_forest (RDF) | water; sediment; soil; compost; sewage_sludge; air; sea_salt | FPA-μFTIR imaging (>20 polymer classes) | accuracy_0.9766; kappa_0.9690 (Monte Carlo CV); PP_sensitivity_0.957; PVC_sensitivity_1.000 | ~12000 reference spectra; 8 application matrices | unverified | no | no | "FPA-μFTIR lab only; expert dual-control; commercial Purency software" | forward_fulltext |  | Phase 6.4 FTIR+ML batch |
| W4291123479 | https://openalex.org/W4291123479 | 10.3389/fpubh.2022.907280 | An automated solid waste detection using the optimized YOLO model for riverine management | 2022 | rgb_object_detection | YOLO (custom CSPDarknet+SPP+Mish+DIoU-NMS) | urban_river | macro solid waste floating | mAP_89.0%; F1_0.8; recall_86% | 9554 train / 2481 test images (per source) | unverified | no | yes | "macro litter not MP; robot E2E unverified" | forward_fulltext |  | Phase 6.5 YOLO/OD batch |
| W4391755619 | https://openalex.org/W4391755619 | 10.1038/s41598-024-53251-5 | A new approach to classifying polymer type of microplastics based on Faster-RCNN-FPN and spectroscopic imagery under ultraviolet light | 2024 | rgb_object_detection_uv | Faster_R-CNN ResNet50-FPN | coastal_marine_field | MP polymer boxes (UV imagery) | precision_85.5-87.8%; mAP_33.9% internal; mAP_35.7% external | field UV images; splits unverified | unverified | yes | yes | "low mAP on small objects; UV capture protocol needed; author low-cost claim" | forward_fulltext |  | Phase 6.5 YOLO/OD batch; edge_low_cost updated 6.9 |
| W4321194910 | https://openalex.org/W4321194910 | 10.1109/lra.2023.3245405 | Towards More Efficient EfficientDets and Real-Time Marine Debris Detection | 2023 | rgb_object_detection_auv | EfficientDet (BiFPN variants) | marine_underwater | macro marine debris | mAP_AP_gain_+1.5% D0; +2.6% D1; +1.2% D2; +1.3% D3 (vs baseline) | PASCAL VOC + custom in-water plastic dataset | unverified | no | no | "AUV simulator stage; macro debris not MPs" | forward_fulltext |  | Phase 6.5 YOLO/OD batch |
| W3204790372 | https://openalex.org/W3204790372 | 10.3390/s21196391 | An Embeddable Algorithm for Automatic Garbage Detection Based on Complex Marine Environment | 2021 | rgb_object_detection_underwater | Mask_R-CNN (improved FPN+attention) | underwater_marine | macro garbage | unverified (mAP_+9.6% vs baseline; segmentation_+5.0%) | unverified | unverified | yes | yes | "underwater low resolution; Mask R-CNN not YOLO; embeddable for AUV" | forward_fulltext |  | Phase 6.5 YOLO/OD batch; edge_low_cost updated 6.9 |
| W4296114416 | https://openalex.org/W4296114416 | 10.1007/s13762-022-04505-0 | Raman spectroscopy for microplastic detection in water sources: a systematic review | 2022 | review_synthesis_spectroscopy | systematic_review | freshwater_marine_sewage | water_body_RS_applications | unverified (WWTP_removal_1.8-54.5%_cited; advanced_treatment_~88.6%_cited; pretreatment_78±8%_cited) | literature_synthesis | unverified | unverified | no | "2022 search window; RS cost; heterogeneous primary quality" | forward_fulltext |  | Phase 6.6 wastewater; review not primary benchmark |
| W4382940669 | https://openalex.org/W4382940669 | 10.3390/toxics11070575 | Environmentally Friendly Approach to the Reduction of Microplastics during Domestic Washing | 2023 | review_textile_washing | narrative_review | domestic_wastewater_textile | microfiber_release_mitigation | unverified (fiber-release stats cited; no unified ML metric) | literature_synthesis | unverified | unverified | no | "upstream prevention; machine vision prospective only" | forward_fulltext |  | Phase 6.6 wastewater; review not primary benchmark |
| W4210266455 | https://openalex.org/W4210266455 | 10.1007/s11157-021-09609-6 | Micro/nano-plastics occurrence identification risk analysis and mitigation | 2022 | review_occurrence_mitigation | comprehensive_review | air_soil_water_food_wastewater_pathways | MNP_occurrence_and_ID_survey | unverified (aggregated primary findings) | literature_synthesis | unverified | unverified | yes | "broad MNP scope beyond aquatic CV; 2022 cutoff" | forward_fulltext |  | Phase 6.6 wastewater; review not primary benchmark; India affiliation |
| W4393943493 | https://openalex.org/W4393943493 | 10.1007/s10311-024-01731-5 | Analysis of aged microplastics: a review | 2024 | review_aged_MP_analytics | systematic_review | environmental_wastewater_sludge_cited | aged_MP_ID_and_chemometrics | unverified (cited_μFTIR_ML_>97%_literature_aggregate) | literature_synthesis | unverified | unverified | no | "not primary empirical; tropical weathering gap" | forward_fulltext |  | Phase 6.6 wastewater; review not primary benchmark |
| W3091414454 | https://openalex.org/W3091414454 | 10.1088/1748-9326/abbd01 | Machine learning for aquatic plastic litter detection classification and quantification | 2020 | rgb_drone_aerial | PLD-CNN; PLQ-CNN | river_beach_cambodia_field | macro aquatic plastic litter | accuracy_up_to_83% (PLD-CNN); PLQ-CNN_60-71%; baseline_83%_cited | DJI Phantom 4 Pro Cambodia Oct 2019 | unverified | no | yes | "drone mosaic dependency; class imbalance; macro not MP" | forward_fulltext |  | Phase 6.7 marine/coastal field |
| W3122508379 | https://openalex.org/W3122508379 | 10.33175/mtr.2021.248053 | Managing Marine Environmental Pollution using Artificial Intelligence | 2021 | review_synthesis_marine | narrative_review | marine_pollution_monitoring | marine litter and cleanup robotics | unverified (plastic_vs_natural_86%_cited_one_system) | literature_synthesis | unverified | unverified | yes | "broad scope beyond MPs; no unified MP benchmark" | forward_fulltext |  | Phase 6.7 marine/coastal; review not primary benchmark; India |
| W4318615471 | https://openalex.org/W4318615471 | 10.11591/ijece.v13i3.pp3407-3419 | A review of hyperspectral imaging-based plastic waste detection state-of-the-art | 2023 | review_hsi_plastic_waste | literature_review | marine_coastal_environmental_HSI | plastic waste sorting and environment | unverified (ResNet_88.6%_cited; HSI_PP_HDPE_>90%_sensitivity_cited; submm_precision_<80%_cited) | literature_synthesis | unverified | unverified | yes | "waste focus; sensor cost; sub-mm MP weak" | forward_fulltext |  | Phase 6.7 marine/coastal; review not primary benchmark; Malaysia |
| W4385411640 | https://openalex.org/W4385411640 | 10.1021/acs.est.3c01243 | A practical primer for image-based particle measurements in microplastic research | 2023 | methods_primer_image_metrology | harmonized_workflow (no trained classifier) | marine_freshwater_MP_imaging_context | particle size and shape descriptors | unverified (metrology accuracy concepts only) | literature_synthesis | unverified | unverified | no | "no ML detector; harmonization for CV output metrics" | forward_fulltext |  | Phase 6.7 marine/coastal field metrology primer |
| W3134265767 | https://openalex.org/W3134265767 | 10.3389/ftox.2021.636640 | Micro and Nanoplastics Identification Classic Methods and Innovative Detection Techniques | 2021 | review_synthesis_ID_methods | structured_review | freshwater_river_lake_sediment_cited | aquatic and terrestrial ID survey | unverified (per-method accuracies from cited primaries) | literature_synthesis | unverified | unverified | no | "2021 cutoff; ID not deployment; lake/river cases cited" | forward_fulltext |  | Phase 6.8 freshwater/river; review not primary benchmark |
| W4304690559 | https://openalex.org/W4304690559 | 10.1016/j.apsusc.2022.155239 | Surface-enhanced Raman spectroscopy for the detection of microplastics in water | 2022 | raman_SERS_bench | SERS_substrate_optimization (no ML classifier) | pure_water_lab | aquatic microparticles PS PE | LOD_0.6M_aggregation_agent_PS; PE_signal_weak | PS 350nm; PE 1-4um bench | unverified | no | no | "pure water bench only; no trained ML; substrate fab" | forward_fulltext |  | Phase 6.8 freshwater/river; bench aquatic water not field river |
| W4313826580 | https://openalex.org/W4313826580 | 10.3390/ijerph20021150 | A Critical Review on Artificial Intelligence-Based Microplastics Imaging Technology | 2023 | review_bibliometric_imaging | bibliometric_science_mapping | global_imaging_AI_corpus (WoS 2019-2022) | MP imaging research trends | unverified (cluster metrics from WoS figures) | literature_synthesis | unverified | unverified | yes | "WoS ends 2022; spectroscopy-only ML underrepresented" | forward_fulltext |  | Phase 6.8 freshwater/river gap-map; review not primary benchmark; China |
| W4290026722 | https://openalex.org/W4290026722 | 10.1186/s12989-022-00492-9 | Coronas of micro/nano plastics a key determinant in their risk assessments | 2022 | review_corona_risk | narrative_review | freshwater_marine_soil_aquatic | MNP surface coronas and toxicity | unverified (no detection ML benchmark) | literature_synthesis | unverified | unverified | no | "risk/toxicity not field detection; informs spectroscopy interpretation" | forward_fulltext |  | Phase 6.8 freshwater/river; review not primary benchmark |
| W4400418758 | https://openalex.org/W4400418758 | 10.3390/s24134394 | Real-Time Detection of Microplastics Using an AI Camera | 2024 | rgb_ai_camera_edge | YOLOv5; DeepSORT | laboratory_flume | MP in motion size and velocity | unverified (precision in abstract truncated) | unverified | unverified | yes | no | "real-time AI camera; lab flume not field; FT pending" | forward_fulltext |  | Phase 6.9 edge/low-cost; abstract-only extraction |
| W4409162823 | https://openalex.org/W4409162823 | 10.1039/d4ra07991d | Deep-learning enabled rapid and low-cost detection of microplastics in consumer products | 2025 | phone_microscopy_rgb | YOLOv5 | consumer_product_on_site_extraction | TinyScope phone attachment ($10 cited) | unverified (YOLO splits 1990/250/250 cited) | unverified | unverified | yes | no | "low-cost phone microscope; consumer matrix not aquatic field" | forward_fulltext |  | Phase 6.9 edge/low-cost; abstract-only extraction |
| W4385454320 | https://openalex.org/W4385454320 | 10.3390/jmse11081532 | Real-Time Instance Segmentation for Detection of Underwater Litter as a Plastic Source | 2023 | rgb_instance_segmentation_underwater | YOLACT; Mask_R-CNN | underwater_marine_litter | macro underwater litter | unverified (TrashCAN trained; metrics truncated in harvest) | unverified | unverified | unverified | no | "real-time segmentation claim; macro litter not MPs; FT pending" | forward_fulltext |  | Phase 6.9 edge/low-cost; abstract-only extraction |
| W4404688861 | https://openalex.org/W4404688861 | 10.1109/oceans55160.2024.10754063 | Toward in Situ Identification of Microplastics in Water Using Raman Spectroscopy and Machine Learning | 2024 | raman_in_situ_prototype | ML_plastic_classification (preliminary) | ocean_in_situ_proposed | multi-sensor in situ MP ID | unverified (preliminary Raman+ML results) | unverified | unverified | unverified | no | "prototype in-situ system; not deployed; FT pending" | forward_fulltext |  | Phase 6.9 edge/low-cost; abstract-only extraction |
| W4392657594 | https://openalex.org/W4392657594 | 10.1177/00037028241233304 | A Study of High-Frequency Noise for Microplastics Classification Using Raman | 2024 | raman_spectroscopy | ML_classification (Raman peaks; noise tradeoff study) | laboratory_MP_ID | MP polymer classification from Raman | unverified | unverified | unverified | no | yes | "priority_latam; author country CO; FT pending" | forward_fulltext |  | Phase 6.10 LATAM/GS; abstract-only; Colombia affiliation |
| W4408220111 | https://openalex.org/W4408220111 | 10.3390/microplastics4010012 | Microplastic Deposits Prediction on Urban Sandy Beaches Integrating Remote Sensing and ML | 2025 | remote_sensing_GNSS_Raman_ML | Random_Forest; Gradient_Boosting | Sao_Paulo_coast_Brazil_field | urban beach MP deposition | unverified (MP_6-35_per_m2 cited in abstract; RF_GB accuracy cited) | unverified | unverified | unverified | yes | "priority_latam; Brazil field site; latam_flag=yes" | forward_fulltext |  | Phase 6.10 LATAM/GS; abstract-only extraction |
| W4408550134 | https://openalex.org/W4408550134 | 10.3390/recycling10020046 | FTIR-Based Microplastic Classification ML and DL Techniques | 2025 | ftir_spectroscopy | k-NN; SVM; NB; RF; CNN; MLP | laboratory_six_polymers | industrial polymer FTIR classes | unverified (normalization comparison study) | unverified | unverified | unverified | yes | "priority_latam; country MX; FT pending" | forward_fulltext |  | Phase 6.10 LATAM/GS; abstract-only; Mexico |
| W3196128465 | https://openalex.org/W3196128465 | 10.1016/j.chemosphere.2021.131903 | Training and evaluating machine learning algorithms for ocean microplastics classification through vibrational spectroscopy | 2021 | vibrational_spectroscopy | ML_algorithms (FTIR/Raman per title) | ocean_microplastics | Brazil_UFSC_authors | unverified | unverified | unverified | no | yes | "priority_latam; Brazil UFSC; no abstract in harvest; FT pending" | forward_fulltext |  | Phase 6.10 LATAM/GS; abstract-only; Brazil |


# Part XVI — PRISMA summary (JSON)


```json
{
  "project": "microplastics-ml-detection-slr",
  "generated_at": "2026-05-18",
  "search_lock_date": "2026-05-18",
  "openalex_mailto": "dsilgadosalcedo@users.noreply.github.com",
  "queries": [
    "microplastic detection machine learning CNN",
    "microplastic computer vision deep learning water",
    "microplastic Raman spectroscopy machine learning",
    "microplastic hyperspectral imaging classification",
    "microplastic YOLO detection"
  ],
  "synthesis_type": "systematic_map",
  "stats": {
    "identified": 228,
    "after_dedupe": 228,
    "screened_title_abstract": 228,
    "excluded_title_abstract": 112,
    "forward_fulltext": 116,
    "forward_fulltext_target_met": true,
    "full_text_sought": 116,
    "full_text_obtained": 31,
    "oa_pdfs_in_raw": 25,
    "oa_pdfs_target_met": true,
    "full_text_not_retrieved": 85,
    "full_text_pending_retrieval": 0,
    "included": 31,
    "excluded_full_text": 85,
    "full_text_html_obtained": 6,
    "full_text_harvest_complete": true,
    "summaries_md": 31,
    "summaries_target_met": true,
    "summaries_scope": "all fulltext_obtained papers",
    "extraction_rows": 37,
    "extraction_target_met": true,
    "extraction_primary_rows": 28,
    "extraction_review_rows": 9,
    "extraction_abstract_only_rows": 8,
    "extraction_global_south_yes": 12
  },
  "targets_met": {
    "forward_fulltext": true,
    "oa_pdfs": true,
    "summaries": true,
    "extraction": true,
    "full_text_harvest": true
  },
  "flow": [
    {
      "phase": "identification",
      "label": "Records identified (OpenAlex)",
      "n": 228,
      "manifest_key": "identified"
    },
    {
      "phase": "identification",
      "label": "Records after deduplication",
      "n": 228,
      "manifest_key": "after_dedupe"
    },
    {
      "phase": "screening",
      "label": "Records screened (title/abstract)",
      "n": 228,
      "manifest_key": "screened_title_abstract"
    },
    {
      "phase": "screening",
      "label": "Records excluded (title/abstract)",
      "n": 112,
      "manifest_key": "excluded_title_abstract"
    },
    {
      "phase": "screening",
      "label": "Records forwarded to full text",
      "n": 116,
      "manifest_key": "forward_fulltext"
    },
    {
      "phase": "eligibility",
      "label": "Full-text articles sought",
      "n": 116,
      "manifest_key": "full_text_sought"
    },
    {
      "phase": "eligibility",
      "label": "Full-text articles obtained",
      "n": 31,
      "manifest_key": "full_text_obtained"
    },
    {
      "phase": "eligibility",
      "label": "OA PDFs in corpus/raw/",
      "n": 25,
      "manifest_key": "oa_pdfs_in_raw"
    },
    {
      "phase": "eligibility",
      "label": "Full-text HTML/normalized",
      "n": 6,
      "manifest_key": "full_text_html_obtained"
    },
    {
      "phase": "eligibility",
      "label": "Full-text articles not retrieved",
      "n": 85,
      "manifest_key": "full_text_not_retrieved"
    },
    {
      "phase": "included",
      "label": "Studies summarised (all obtained FT)",
      "n": 31,
      "manifest_key": "included"
    },
    {
      "phase": "included",
      "label": "Structured extraction rows",
      "n": 37,
      "manifest_key": "extraction_rows"
    }
  ],
  "breakdown": {
    "fulltext_not_retrieved_by_reason": {
      "oa_download_failed": 15,
      "oa_not_retrieved": 40,
      "paywalled": 30
    },
    "included_fulltext_by_type": {
      "primary_articles": 22,
      "reviews": 9,
      "total": 31
    },
    "extraction": {
      "total_rows": 37,
      "primary_rows": 28,
      "review_rows": 9,
      "abstract_only_rows": 8,
      "global_south_yes": 12
    },
    "latam_subset": 25,
    "claims_jsonl": 71
  },
  "sources": {
    "manifest": "manifest.json",
    "prisma_flow_md": "corpus/structured/prisma-flow.md",
    "papers_jsonl": "corpus/structured/papers.jsonl"
  }
}

```
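
The counts in this summary can be reconciled arithmetically. The sketch below is illustrative only and is not part of the factory tooling: the function name `check_prisma_counts` is hypothetical, and the check assumes nothing beyond the `stats` and `breakdown` keys shown above.

```python
def check_prisma_counts(summary: dict) -> list[str]:
    """Return a description of every count that fails to reconcile."""
    s, b = summary["stats"], summary["breakdown"]
    by_type = b["included_fulltext_by_type"]
    checks = [
        ("screened - excluded = forwarded to full text",
         s["screened_title_abstract"] - s["excluded_title_abstract"],
         s["forward_fulltext"]),
        ("sought - obtained = not retrieved",
         s["full_text_sought"] - s["full_text_obtained"],
         s["full_text_not_retrieved"]),
        ("OA PDFs + HTML = full texts obtained",
         s["oa_pdfs_in_raw"] + s["full_text_html_obtained"],
         s["full_text_obtained"]),
        ("not-retrieved reasons sum to not retrieved",
         sum(b["fulltext_not_retrieved_by_reason"].values()),
         s["full_text_not_retrieved"]),
        ("primary articles + reviews = included",
         by_type["primary_articles"] + by_type["reviews"],
         s["included"]),
        ("primary rows + review rows = extraction rows",
         s["extraction_primary_rows"] + s["extraction_review_rows"],
         s["extraction_rows"]),
    ]
    return [f"{label}: got {got}, expected {want}"
            for label, got, want in checks if got != want]


# Values copied from the JSON summary above (only the keys the checks use).
summary = {
    "stats": {
        "screened_title_abstract": 228, "excluded_title_abstract": 112,
        "forward_fulltext": 116, "full_text_sought": 116,
        "full_text_obtained": 31, "full_text_not_retrieved": 85,
        "oa_pdfs_in_raw": 25, "full_text_html_obtained": 6,
        "included": 31, "extraction_rows": 37,
        "extraction_primary_rows": 28, "extraction_review_rows": 9,
    },
    "breakdown": {
        "fulltext_not_retrieved_by_reason": {
            "oa_download_failed": 15, "oa_not_retrieved": 40, "paywalled": 30,
        },
        "included_fulltext_by_type": {
            "primary_articles": 22, "reviews": 9, "total": 31,
        },
    },
}

problems = check_prisma_counts(summary)
print(problems or "all counts reconcile")
```

With the counts reported above, every identity holds. Note that `extraction_rows` (37) is intentionally larger than `included` (31) because abstract-only rows are also extracted, so that pair is not checked.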


# Part XVII — Discovery deferred


Items are moved here when the factory is closed without executing them. Each entry records the original ID, title, original rationale, and the date and reason for deferral.

## BUILD-001

**Title:** OpenAlex harvest CLI with resume + pagination

**Original rationale:** Query 1 has 842 hits; spine 1.6–1.7 need reliable paging without duplicate paper_ids.

**Deferred:** 2026-05-18 (iteration 148)

**Reason:** Search lock complete with 228 unique works via manual/agent harvest (Phases 1.1–1.12). CLI is post-factory maintenance for corpus refresh or query expansion, not required for SLR deliverables at lock date.

**Suggested follow-up:** `tools/openalex-harvest-cli/` if re-harvesting after protocol amendment.
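
For orientation, resumable paging without duplicate IDs could look roughly like the sketch below. It is not the deferred CLI. The names `harvest` and `openalex_fetcher` and the two file paths are hypothetical, and the sketch assumes the OpenAlex `/works` endpoint accepts `search`, `per-page`, `cursor`, and `mailto` parameters and returns `meta.next_cursor` alongside `results`.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Callable


def openalex_fetcher(query: str, mailto: str) -> Callable[[str], dict]:
    """Build a page fetcher for one query (assumed OpenAlex cursor paging)."""
    def fetch(cursor: str) -> dict:
        params = urllib.parse.urlencode(
            {"search": query, "per-page": 200, "cursor": cursor, "mailto": mailto}
        )
        with urllib.request.urlopen(f"https://api.openalex.org/works?{params}") as resp:
            return json.load(resp)
    return fetch


def harvest(fetch_page: Callable[[str], dict], out_path: Path, cursor_path: Path) -> int:
    """Append unseen works to a JSONL file; return how many were added."""
    # IDs already on disk are the dedupe set, so a rerun never re-adds them.
    seen = set()
    if out_path.exists():
        with out_path.open() as fh:
            seen = {json.loads(line)["id"] for line in fh if line.strip()}
    # Resume from the last checkpointed cursor; "*" starts a fresh harvest.
    cursor = cursor_path.read_text().strip() if cursor_path.exists() else "*"
    added = 0
    while cursor:
        page = fetch_page(cursor)
        fresh = []
        for work in page["results"]:
            if work["id"] not in seen:
                seen.add(work["id"])
                fresh.append(work)
        with out_path.open("a") as fh:
            for work in fresh:
                fh.write(json.dumps(work) + "\n")
        added += len(fresh)
        # Checkpoint only after the page is written; empty string means done.
        cursor = page["meta"].get("next_cursor") or ""
        cursor_path.write_text(cursor)
    return added
```

Writing the page before the cursor means an interruption between the two steps only causes a refetch of one page, which the ID set then filters out.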

## BUILD-002

**Title:** OA PDF batch downloader (Unpaywall + OpenAlex pdf_url, sha256)

**Original rationale:** Phase 4.1 hit Elsevier/IEEE 403 on first 8 priority IDs; script needed for batches 4.2–4.3.

**Deferred:** 2026-05-18 (iteration 149)

**Reason:** Full-text targets were met at search lock (25 OA PDFs against a target of ≥20; 31 full texts obtained in total). Paywalled and publisher-blocked IDs are documented as `fulltext_unavailable` in jsonl/manifest. The batch downloader is optional automation for a future corpus refresh, not a factory deliverable.

**Suggested follow-up:** `tools/oa-pdf-downloader/` with rate limits and publisher-specific fallbacks.
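
A minimal sketch of the download step, under stated assumptions, is given below. It is not the deferred tool. The names `http_get` and `fetch_pdf` and the returned record shape are hypothetical; only the status strings `fulltext_obtained` and `fulltext_unavailable` come from this report. Candidate URLs (for example from Unpaywall or the OpenAlex `pdf_url` field) are assumed to be supplied by the caller.

```python
import hashlib
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable


def http_get(url: str) -> bytes:
    """Plain GET; urllib raises HTTPError (an OSError subclass) on 403 etc."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_pdf(
    paper_id: str,
    urls: Iterable[str],
    raw_dir: Path,
    fetch: Callable[[str], bytes] = http_get,
    delay_s: float = 1.0,
) -> dict:
    """Try candidate URLs in order; keep the first body that is a PDF."""
    for attempt, url in enumerate(urls):
        if attempt:
            time.sleep(delay_s)  # crude rate limit between attempts
        try:
            body = fetch(url)
        except OSError:
            continue  # publisher block or network error: try the next URL
        if not body.startswith(b"%PDF-"):
            continue  # landing page or error page served instead of a PDF
        raw_dir.mkdir(parents=True, exist_ok=True)
        (raw_dir / f"{paper_id}.pdf").write_bytes(body)
        return {
            "paper_id": paper_id,
            "status": "fulltext_obtained",
            "url": url,
            "sha256": hashlib.sha256(body).hexdigest(),
        }
    return {"paper_id": paper_id, "status": "fulltext_unavailable"}
```

Checking the `%PDF-` magic bytes before hashing keeps HTML error pages out of `corpus/raw/`, and the stored digest lets a later refresh detect changed or corrupted files.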

## CORPUS-001

**Title:** Obtain OA PDFs for WWTP hyperspectral primaries

**Original rationale:** Spine 6.6 deferred W4385737119, W4414305742, W4409917643 (`forward_fulltext`, `fulltext_unavailable`); needed for primary extraction rows.

**Deferred:** 2026-05-18 (iteration 150)

**Reason:** All three remain paywalled or publisher-blocked after Phase 4 harvest. WWTP cluster is covered in `extraction.csv` via review rows and abstract-only documentation; gap noted in `slr.md` §3.3 and `questions.md`. Obtaining PDFs would require institutional access outside factory scope at search lock.

**Paper IDs:** W4385737119, W4414305742, W4409917643

**Suggested follow-up:** Manual download if institutional access becomes available; add primary extraction row + summary if FT obtained.


# Appendix — Included paper summaries index


_Summaries live under `corpus/summaries/` in the research repo. Obtained full texts (n=31) were synthesised in Part I._