llm-tools/mcps/dicom_mcp/docs/TODO.md
Gregory Gauthier 83ec950df7 first commit
2026-04-08 12:11:04 +01:00

DICOM MCP Server — TODO

Two remaining items. The original four-item enhancement plan (2026-02-11) has been completed — dicom_query and dicom_search were implemented, along with four additional tools (dicom_dump_tree, dicom_compare_uids, dicom_verify_segmentations, dicom_analyze_ti). The items below are follow-on improvements identified during testing.


1. Private Tag Exploration Tool (dicom_private_tags)

Priority: High — this is the trickiest to get right but the most valuable for cross-manufacturer QA.

Problem

Manufacturers store critical acquisition parameters in private (vendor-specific) DICOM tags rather than standard public tags. This causes real issues in multi-vendor QA workflows:

  • Philips TE=0 quirk: The Erasmus Achieva 1.5T dataset shows EchoTime = 0 for several Dixon series (Body mDixon THRIVE, Thigh Dixon Volume). The actual multi-point Dixon echo times are stored in Philips private tags, not the standard (0018,0081) EchoTime field. This was confirmed via dicom_query grouped by SeriesDescription on the Erasmus dataset — 328 of 648 Thigh Dixon files and 100 of 157 mDixon THRIVE files report TE=0.
  • Manufacturer encoding differences: Siemens stores echo times in the standard public tags (e.g. MOST series TEs of 2.38–19.06 ms on Avanto_fit). Philips MOST TEs (2.371–18.961 ms) are in public tags too, but Philips Dixon TEs are hidden in private tags. GE encodes the Dixon image type in the ImageType field rather than in the series description.
  • Nested sequences and binary blobs: Philips private tags frequently contain nested DICOM sequences, and some values are only interpretable if you know the specific software version. Binary data needs special handling to avoid dumping unreadable content.
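
The TE=0 counts above came from dicom_query. As a rough illustration of the same per-series tally outside the tool, the sketch below counts zero-valued EchoTime entries per SeriesDescription. The function name zero_te_by_series and the (description, echo_time) input pairs are assumptions made for this example, not part of the server; in practice the pairs would be read from the file headers.

```python
from collections import defaultdict


def zero_te_by_series(records):
    """Tally files reporting EchoTime == 0 for each series description.

    `records` is an iterable of (series_description, echo_time) pairs
    (hypothetical input shape); echo_time may be None when the tag is absent.
    Returns {description: (zero_count, total_count)}.
    """
    tally = defaultdict(lambda: [0, 0])
    for description, echo_time in records:
        counts = tally[description]
        counts[1] += 1  # every file counts towards the series total
        if echo_time is not None and float(echo_time) == 0.0:
            counts[0] += 1  # this file reports TE = 0
    return {desc: (zero, total) for desc, (zero, total) in tally.items()}
```

A series where a large fraction of files report TE=0 is a candidate for having its real echo times stored in private tags.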

Discussion Notes

From our initial conversation, we decided not to implement this tool immediately because:

  1. Deciphering some private tags requires conditional logic based on the contents of certain public tags (or other private tags). The exact rules are manufacturer-specific and need to be rediscovered through hands-on exploration.
  2. Building the wrong abstraction would be worse than no abstraction — we need to tinker with real data first before committing to a tool design.

Proposed Design (Single tool with three modes)

discover mode — Scan a file and list all private tag blocks with their creator strings. Answers "what vendor modules are present?" Output: group number, creator string, tag count per block.
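
A minimal sketch of the discover logic, assuming the file's elements have already been read into (group, element, value) tuples by a DICOM parser. It relies on the standard private-tag layout: private groups are odd-numbered, a creator element at (gggg,00XX) reserves the block (gggg,XX00) to (gggg,XXFF), and creator elements occupy 0010 to 00FF. The function name, the input shape, and the output keys are illustrative assumptions, not the tool's actual interface.

```python
from collections import defaultdict


def discover_private_blocks(elements):
    """List private blocks as dicts of group, block, creator and tag count.

    `elements` is an iterable of (group, element, value) tuples
    (hypothetical input shape for this sketch).
    """
    creators = {}               # (group, block) -> creator string
    counts = defaultdict(int)   # (group, block) -> number of data elements
    for group, element, value in elements:
        if group % 2 == 0:
            continue  # even groups are standard (public) tags
        if 0x0010 <= element <= 0x00FF:
            # Private creator element: its element number names the block.
            creators[(group, element)] = str(value).strip()
        elif element >= 0x1000:
            # Private data element: the high byte identifies its block.
            counts[(group, element >> 8)] += 1

    blocks = []
    for group, block in sorted(set(creators) | set(counts)):
        blocks.append({
            "group": f"{group:04X}",
            "block": f"{block:02X}",
            "creator": creators.get((group, block), "<no creator found>"),
            "tag_count": counts.get((group, block), 0),
        })
    return blocks
```

Blocks that contain data elements but have no matching creator are still reported, since an orphaned block is itself a useful QA finding.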

dump mode — Show all private tags within a specific creator block (or all private tags in a file). For each tag: hex address, creator, VR, value. Binary values show first N bytes as hex + length. Nested sequences show item count with optional one-level-deep recursion.
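
One possible shape for a dump row, sketched under the assumption that each tag arrives as a group, element, creator, VR, and an already-decoded value (with sequences represented as a list of items). The helper names preview_value and format_dump_row, and the exact output layout, are hypothetical.

```python
def preview_value(vr, value, max_bytes=64):
    """Return a short, printable summary of a tag value."""
    if vr == "SQ":
        # Nested sequence: report the item count instead of the contents.
        return f"<sequence: {len(value)} item(s)>"
    if isinstance(value, (bytes, bytearray)):
        # Binary blob: first max_bytes as hex, plus the total length.
        head = bytes(value[:max_bytes]).hex(" ")
        suffix = " ..." if len(value) > max_bytes else ""
        return f"<binary, {len(value)} bytes: {head}{suffix}>"
    return str(value)


def format_dump_row(group, element, creator, vr, value, max_bytes=64):
    """Format one line of dump output: hex address, creator, VR, value."""
    summary = preview_value(vr, value, max_bytes)
    return f"({group:04X},{element:04X})  {creator:<24}  {vr}  {summary}"
```

Keeping the preview logic separate from the row layout would let the optional one-level recursion call preview_value on each item's tags without duplicating the binary handling.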

search mode — Scan across a directory looking for private tags matching a keyword in either the creator string or the tag value. Useful for hunting down where manufacturers hide specific parameters (e.g. "find any private tag with 'echo' in the creator or value").
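
The matching step of search mode could look roughly like this. It assumes the private tags from every file in the directory have already been collected into dicts with "creator" and "value" keys; that record shape and the function name search_private_tags are assumptions for illustration only.

```python
def search_private_tags(records, keyword):
    """Return records whose creator string or value contains `keyword`.

    Matching is case-insensitive. Binary values are decoded as ASCII with
    undecodable bytes dropped, so embedded text can still be found.
    """
    needle = keyword.casefold()
    hits = []
    for record in records:
        creator = str(record.get("creator", ""))
        value = record.get("value")
        if isinstance(value, (bytes, bytearray)):
            text = bytes(value).decode("ascii", errors="ignore")
        else:
            text = str(value)
        if needle in creator.casefold() or needle in text.casefold():
            hits.append(record)
    return hits
```

Substring matching is deliberately loose here; it suits exploratory hunting, where false positives are cheaper than missed tags.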

Additional Considerations

  • Creator filtering: Filter by creator substring, e.g. creator="Philips" to only see Philips blocks.
  • Known tag dictionaries: Embed a small lookup table for commonly useful private tags (e.g. Philips (2005,xx10) for actual echo times). Start without this and add later.
  • Binary value display: Show first 64 bytes as hex + total length, rather than full dumps.

Suggested Next Steps

  1. Start by exploring the Erasmus Philips data with dicom_get_metadata using custom hex tags to see what private blocks exist and specifically chase down the TE=0 mystery.
  2. Do the same on Siemens and GE data to understand the differences.
  3. Once the patterns and conditional logic are clear, design the tool around real use cases.

2. dicom_compare_headers Directory Mode

Priority: Medium — useful for cross-series protocol checks but less urgent than private tags.

Problem

dicom_compare_headers currently requires 2–10 explicit file paths. For cross-series protocol validation (e.g. "are all MOST series using the same TR/FA across a study?"), you have to manually pick representative files from each series first.

Proposed Enhancement

Add a directory mode that automatically picks one representative file per series and compares them. This would enable single-call cross-series protocol checks.

Design Ideas

  • New parameter: directory as an alternative to file_paths
  • Auto-select one file per unique SeriesInstanceUID (first file encountered, or configurable)
  • Reuse existing comparison logic
  • Show series description in output to identify which series each column represents
  • Optionally filter which series to include (by description pattern or sequence type)
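
The representative-selection step from the design ideas above might be sketched as follows, assuming each scanned file has been reduced to a dict with "path", "SeriesInstanceUID", and "SeriesDescription" keys. The function name pick_representatives and the glob-style description filter are assumptions for this example; the existing comparison logic would then run on the returned paths.

```python
from fnmatch import fnmatch


def pick_representatives(records, description_pattern=None):
    """Pick the first file encountered for each unique SeriesInstanceUID.

    `description_pattern` is an optional case-insensitive glob
    (e.g. "*most*") applied to SeriesDescription.
    """
    chosen = {}  # SeriesInstanceUID -> first matching record (insertion-ordered)
    for record in records:
        description = record.get("SeriesDescription", "")
        if description_pattern is not None and not fnmatch(
            description.lower(), description_pattern.lower()
        ):
            continue  # series excluded by the description filter
        chosen.setdefault(record["SeriesInstanceUID"], record)
    return list(chosen.values())
```

Returning whole records rather than bare paths keeps the series description available for labelling each output column.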

Last updated: 2026-02-25 — after adding 4 new tools (dicom_dump_tree, dicom_compare_uids, dicom_verify_segmentations, dicom_analyze_ti) and smoke testing against Philips, Siemens, and GE MOLLI/NOLLI data.