Preparations for Building Scalable Cloud DICOM-WEB Services

This article introduces the architectural design of a DICOM medical imaging system developed using Rust, which employs a modern technology stack including PostgreSQL as the primary index database, Apache Doris for log storage, RedPanda as the message queue, and Redis for caching. The system design supports both standalone operation and distributed scaling, fully leveraging the safety and performance advantages of the Rust programming language.

Overview of DICOM Medical Imaging System Architecture and Runtime Environment

Core Components

  • PostgreSQL: Primary index database storing core metadata such as patients, studies, and series
  • Apache Doris: Log storage for recording DICOM CStoreSCP service and WADO-RS service access logs
  • RedPanda: Message queue for handling asynchronous communication between systems
  • Redis: Cache layer to improve system response speed
  • Rust: Programming language utilizing the dicom-rs library for DICOM data processing

Service Modules

  • wado-storescp: DICOM CStoreSCP service that receives DICOM files and writes them to disk
  • wado-consumer: Consumes storage events from message queues, extracts metadata, and writes to databases
  • wado-server: DICOM WEB WADO-RS API interface implementation
  • wado-webworker: Periodically generates JSON-formatted metadata for accelerated access
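
These modules communicate through the message queue. As a minimal sketch (not the project's actual code), the following shows how wado-storescp might publish a storage event to the storage_queue topic after writing a received file to disk; it assumes the rdkafka, serde, serde_json, tokio, and anyhow crates, and the StorageEvent fields are illustrative:

use rdkafka::config::ClientConfig;
use rdkafka::producer::{FutureProducer, FutureRecord};
use serde::Serialize;
use std::time::Duration;

// Illustrative event shape; the real service likely carries more fields.
#[derive(Serialize)]
struct StorageEvent {
    tenant_id: String,
    study_uid: String,
    series_uid: String,
    sop_uid: String,
    file_path: String,
}

// Publish one storage event as JSON; wado-consumer (or a Doris ROUTINE LOAD
// job, see below) picks it up from the topic.
async fn publish_storage_event(event: &StorageEvent) -> anyhow::Result<()> {
    let producer: FutureProducer = ClientConfig::new()
        .set("bootstrap.servers", "192.168.1.14:9092") // RedPanda broker
        .create()?;
    let payload = serde_json::to_string(event)?;
    producer
        .send(
            FutureRecord::to("storage_queue")
                .key(&event.study_uid) // keep a study's events in one partition
                .payload(&payload),
            Duration::from_secs(5), // delivery timeout
        )
        .await
        .map_err(|(err, _msg)| anyhow::anyhow!(err))?;
    Ok(())
}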

Database Design

PostgreSQL Primary Index Database

PostgreSQL serves as the primary index database for storing core metadata, including:

  • dicom_state_meta: Stores patient, study, and series-level metadata
  • dicom_json_meta: Tracks which series require JSON-formatted metadata generation (flag time, status, and retry count)

Apache Doris Log Storage

Apache Doris is used to store service logs and analytical metadata:

  • DicomObjectMeta: DICOM object (received file) metadata
  • DicomStateMeta: DICOM state metadata
  • DicomImageMeta: DICOM image metadata
  • WadoAccessLog: WADO access logs

Docker Compose Scripts

We assume your database server's IP address is 192.168.1.14.

Operating System: Ubuntu 22.04.5 LTS

PostgreSQL Setup

version: '3'

services:
  pgdb:
    image: ankane/pgvector:latest
    container_name: pgappx
    restart: always
    environment:
      POSTGRES_PASSWORD: "xDicom123"
      POSTGRES_USER: "root"
      PGTZ: "Asia/Shanghai"
    volumes:
      - ./pgdata:/var/lib/postgresql/data
      - ./pg_hba.conf:/var/lib/postgresql/data/pg_hba.conf
    ports:
      - "5432:5432"

Redis, RabbitMQ, and PgAdmin Setup

version: '3'

services:
  redis:
    image: redis
    restart: always
    volumes:
      - ./redata:/data
    ports:
      - "6379:6379"
  rabbitmq:
    image: rabbitmq:management
    ports:
      - "5672:5672"
      - "15672:15672"
    environment:
      RABBITMQ_DEFAULT_USER: admin
      RABBITMQ_DEFAULT_PASS: xDicom123
    volumes:
      - ./rabbitmq/data:/var/lib/rabbitmq

  pgadmin:
    user: root
    container_name: pgadmin4_container
    image: dpage/pgadmin4:8.4
    restart: always
    environment:
      PGADMIN_DEFAULT_EMAIL: oscar.xdev@outlook.com
      PGADMIN_DEFAULT_PASSWORD: xDicom123
      PGADMIN_LISTEN_ADDRESS: 0.0.0.0
      PGADMIN_SERVER_JSON_FILE: /pgadmin4/servers.json
      TZ: Asia/Shanghai
    ports:
      - "8080:80"
      - "9443:443"
    volumes:
      - ./pgadmin:/var/lib/pgadmin
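
Start this stack the same way and verify Redis responds; the RabbitMQ management console and pgAdmin are then reachable on the mapped ports (http://192.168.1.14:15672 and http://192.168.1.14:8080):

docker compose up -d
docker compose exec redis redis-cli ping    # expect: PONG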

Apache Doris and RedPanda Setup

Refer to the official documentation for Doris and RedPanda.

Starting Doris

./Doris3.X/3.1.0/fe/bin/start_fe.sh --daemon
./Doris3.X/3.1.0/be/bin/start_be.sh --daemon
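
Doris FE speaks the MySQL protocol. Assuming the default query port 9030 and the default passwordless root account, you can confirm both nodes have registered:

mysql -h 127.0.0.1 -P 9030 -uroot -e "SHOW FRONTENDS; SHOW BACKENDS;"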

RedPanda Message Queue Operations

  • Creating Topics:
rpk topic create dicom_image_queue  --partitions 1 --replicas 1
rpk topic create dicom_state_queue  --partitions 1 --replicas 1
rpk topic create log_queue          --partitions 1 --replicas 1
rpk topic create storage_queue      --partitions 1 --replicas 1
  • Clearing Topics:
rpk topic trim-prefix dicom_image_queue  -p 0 --offset end --no-confirm
rpk topic trim-prefix dicom_state_queue  -p 0 --offset end --no-confirm
rpk topic trim-prefix log_queue          -p 0 --offset end --no-confirm
rpk topic trim-prefix storage_queue      -p 0 --offset end --no-confirm
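
To confirm the topics exist and spot-check traffic:

rpk topic list
rpk topic describe dicom_state_queue
rpk topic consume dicom_state_queue --num 1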

Database Initialization Scripts

Primary Database (PostgreSQL) Creation Script

drop table if exists dicom_state_meta;
create table dicom_state_meta
(
    tenant_id                varchar(64) not null,
    patient_id               varchar(64) not null,
    study_uid                varchar(64) not null,
    series_uid               varchar(64) not null,
    study_uid_hash           varchar(20) not null,
    series_uid_hash          varchar(20) not null,
    patient_name             varchar(64),
    patient_sex              varchar(1),
    patient_birth_date       date,
    patient_birth_time       time,
    patient_age              varchar(16),
    patient_size             double precision,
    patient_weight           double precision,
    pregnancy_status         integer,
    study_date               date        not null,
    study_date_origin        varchar(8)  not null,
    study_time               time,
    accession_number         varchar(16) not null,
    study_id                 varchar(16),
    study_description        varchar(64),
    modality                 varchar(16),
    series_number            integer,
    series_date              date,
    series_time              time,
    series_description       varchar(256),
    body_part_examined       varchar(64),
    protocol_name            varchar(64),
    series_related_instances integer,
    created_time             timestamp,
    updated_time             timestamp,
    primary key (tenant_id, study_uid, series_uid)
);


create unique index index_state_unique
    on dicom_state_meta (tenant_id, study_uid, series_uid, accession_number);

drop table if exists dicom_json_meta;
create table dicom_json_meta
(
    tenant_id                varchar(64) not null,
    study_uid                varchar(64) not null,
    series_uid               varchar(64) not null,
    study_uid_hash           varchar(20) not null,
    series_uid_hash          varchar(20) not null,
    study_date_origin        varchar(8)  not null,
    flag_time                timestamp   not null,
    created_time             timestamp   not null default current_timestamp(6),
    json_status              int         not null default 0,
    retry_times              int         not null default 0
);

ALTER TABLE dicom_json_meta
ADD CONSTRAINT PK_dicom_json_meta PRIMARY KEY (tenant_id, study_uid, series_uid);
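
For orientation, here is a minimal sketch of how wado-consumer might upsert a series row into dicom_state_meta. It assumes the sqlx crate (with the postgres, runtime-tokio, and chrono features), tokio, and chrono; only a subset of the columns is bound, with illustrative values:

use sqlx::postgres::PgPoolOptions;

// Sketch only: upsert one series-level row, bumping updated_time on conflict.
async fn upsert_series(pool: &sqlx::PgPool) -> Result<(), sqlx::Error> {
    sqlx::query(
        r#"
        INSERT INTO dicom_state_meta
            (tenant_id, patient_id, study_uid, series_uid,
             study_uid_hash, series_uid_hash, study_date, study_date_origin,
             accession_number, created_time, updated_time)
        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, now(), now())
        ON CONFLICT (tenant_id, study_uid, series_uid)
        DO UPDATE SET updated_time = now()
        "#,
    )
    .bind("tenant-a")                                           // tenant_id
    .bind("P0001")                                              // patient_id
    .bind("1.2.3.4.100")                                        // study_uid (fake)
    .bind("1.2.3.4.100.1")                                      // series_uid (fake)
    .bind("a1b2c3")                                             // study_uid_hash
    .bind("d4e5f6")                                             // series_uid_hash
    .bind(chrono::NaiveDate::from_ymd_opt(2024, 1, 1).unwrap()) // study_date
    .bind("20240101")                                           // study_date_origin
    .bind("ACC0001")                                            // accession_number
    .execute(pool)
    .await?;
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = PgPoolOptions::new()
        .max_connections(5)
        .connect("postgres://root:xDicom123@192.168.1.14:5432/root")
        .await?;
    upsert_series(&pool).await
}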
Apache Doris Database Tables and Routine Load Configuration

DROP TABLE IF EXISTS dicom_object_meta;
CREATE TABLE IF NOT EXISTS dicom_object_meta
(
    tenant_id           varchar(64)   not null comment 'Tenant ID',
    patient_id          varchar(64)   not null comment 'Patient ID',
    study_uid           varchar(64)   null,
    series_uid          varchar(64)   null,
    sop_uid             varchar(64)   null,
    file_size           bigint        null,
    file_path           varchar(512) null,
    transfer_syntax_uid varchar(64)   null,
    number_of_frames    int           null,
    created_time        datetime      null,
    series_uid_hash     VARCHAR(20)   null,
    study_uid_hash      VARCHAR(20)   null,
    accession_number    varchar(64)   null,
    target_ts           varchar(64)   null,
    study_date          date          null,
    transfer_status     varchar(64)   null,
    source_ip           varchar(24)   null,
    source_ae           varchar(64)   null,
    trace_id            varchar(36)   not null comment 'Globally unique trace ID, as primary key',
    worker_node_id      varchar(64)   not null comment 'Worker node ID'
)
ENGINE=OLAP
DUPLICATE KEY(tenant_id,patient_id,study_uid,series_uid,sop_uid)
DISTRIBUTED BY HASH(tenant_id) BUCKETS 1
PROPERTIES("replication_num" = "1");


DROP TABLE IF EXISTS dicom_state_meta;
CREATE TABLE IF NOT EXISTS dicom_state_meta (
    -- Basic identification information
    tenant_id VARCHAR(64) NOT NULL,
    patient_id VARCHAR(64) NOT NULL,
    study_uid VARCHAR(64) NOT NULL,
    series_uid VARCHAR(64) NOT NULL,
    study_uid_hash  VARCHAR(20)  NOT NULL,
    series_uid_hash  VARCHAR(20)   NOT NULL,
    study_date_origin VARCHAR(8) NOT NULL,

    -- Patient information
    patient_name VARCHAR(64) NULL,
    patient_sex VARCHAR(1) NULL,
    patient_birth_date DATE NULL,
    patient_birth_time VARCHAR(16) NULL,
    patient_age VARCHAR(16) NULL,
    patient_size DOUBLE NULL,
    patient_weight DOUBLE NULL,
    pregnancy_status INT NULL,

    -- Study information
    study_date DATE NOT NULL,
    study_time VARCHAR(16) NULL,
    accession_number VARCHAR(16) NOT NULL,
    study_id VARCHAR(16) NULL,
    study_description VARCHAR(64) NULL,

    -- Series information
    modality VARCHAR(16) NULL,
    series_number INT NULL,
    series_date DATE NULL,
    series_time VARCHAR(16) NULL,
    series_description VARCHAR(256) NULL,
    body_part_examined VARCHAR(64) NULL,
    protocol_name VARCHAR(64) NULL,
    
    -- Timestamps
    created_time DATETIME NULL,
    updated_time DATETIME NULL
)
ENGINE=OLAP
UNIQUE KEY(tenant_id, patient_id, study_uid, series_uid)
DISTRIBUTED BY HASH(tenant_id) BUCKETS 1
PROPERTIES("replication_num" = "1");


DROP TABLE IF EXISTS dicom_image_meta;
CREATE TABLE IF NOT EXISTS dicom_image_meta (
    -- Basic identification information
    tenant_id VARCHAR(64) NOT NULL COMMENT "Tenant ID",
    patient_id VARCHAR(64) NOT NULL COMMENT "Patient ID",
    study_uid VARCHAR(64) NOT NULL COMMENT "Study UID",
    series_uid VARCHAR(64) NOT NULL COMMENT "Series UID",
    sop_uid VARCHAR(64) NOT NULL COMMENT "Instance UID",

    -- Hash values
    study_uid_hash  VARCHAR(20) NOT NULL COMMENT "Study UID hash value",
    series_uid_hash   VARCHAR(20)   NOT NULL COMMENT "Series UID hash value",

    -- Time related
    study_date_origin DATE NOT NULL COMMENT "Study date (original format)",
    content_date DATE COMMENT "Content date",
    content_time VARCHAR(32) COMMENT "Content time",


    -- Image basic information
    instance_number INT COMMENT "Instance number",
    image_type VARCHAR(128) COMMENT "Image type",
    image_orientation_patient VARCHAR(128) COMMENT "Image orientation (patient coordinate system)",
    image_position_patient VARCHAR(64) COMMENT "Image position (patient coordinate system)",

    -- Image dimension parameters
    slice_thickness DOUBLE COMMENT "Slice thickness",
    spacing_between_slices DOUBLE COMMENT "Spacing between slices",
    slice_location DOUBLE COMMENT "Slice location",

    -- Pixel data attributes
    samples_per_pixel INT COMMENT "Samples per pixel",
    photometric_interpretation VARCHAR(32) COMMENT "Photometric interpretation",
    width INT COMMENT "Image rows",
    columns INT COMMENT "Image columns",
    bits_allocated INT COMMENT "Bits allocated",
    bits_stored INT COMMENT "Bits stored",
    high_bit INT COMMENT "High bit",
    pixel_representation INT COMMENT "Pixel representation",

    -- Reconstruction parameters
    rescale_intercept DOUBLE COMMENT "Reconstruction intercept",
    rescale_slope DOUBLE COMMENT "Reconstruction slope",
    rescale_type VARCHAR(64) COMMENT "Reconstruction type",
    window_center VARCHAR(64) COMMENT "Window center",
    window_width VARCHAR(64) COMMENT "Window width",

    -- Transfer and classification information
    transfer_syntax_uid VARCHAR(64) NOT NULL COMMENT "Transfer syntax UID",
    pixel_data_location VARCHAR(512) COMMENT "Pixel data location",
    thumbnail_location VARCHAR(512) COMMENT "Thumbnail location",
    sop_class_uid VARCHAR(64) NOT NULL COMMENT "SOP class UID",
    image_status VARCHAR(32) COMMENT "Image status",
    space_size BIGINT COMMENT "Occupied space size",
    created_time DATETIME COMMENT "Creation time",
    updated_time DATETIME COMMENT "Update time"
)
ENGINE=OLAP
UNIQUE KEY(tenant_id, patient_id, study_uid, series_uid, sop_uid)
DISTRIBUTED BY HASH(tenant_id) BUCKETS 1
PROPERTIES("replication_num" = "1");

---------------------------------------------------------
CREATE ROUTINE LOAD medical_object_load ON dicom_object_meta
COLUMNS (
    trace_id,
    worker_node_id,
    tenant_id,
    patient_id,
    study_uid,
    series_uid,
    sop_uid,
    file_size,
    file_path,
    transfer_syntax_uid,
    number_of_frames,
    created_time,
    series_uid_hash,
    study_uid_hash,
    accession_number,
    target_ts,
    study_date,
    transfer_status,
    source_ip,
    source_ae
)
PROPERTIES (
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "10",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "format" = "json",
    "max_error_number" = "1000"
)
FROM KAFKA (
    "kafka_broker_list" = "127.0.0.1:9092",
    "kafka_topic" = "log_queue",
    "kafka_partitions" = "0",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);


CREATE ROUTINE LOAD medical_state_load ON dicom_state_meta
COLUMNS (
    tenant_id,
    patient_id,
    study_uid,
    series_uid,
    study_uid_hash,
    series_uid_hash,
    study_date_origin,
    patient_name,
    patient_sex,
    patient_birth_date,
    patient_birth_time,
    patient_age,
    patient_size,
    patient_weight,
    pregnancy_status,
    study_date,
    study_time,
    accession_number,
    study_id,
    study_description,
    modality,
    series_number,
    series_date,
    series_time,
    series_description,
    body_part_examined,
    protocol_name,
    series_related_instances,
    created_time,
    updated_time = NOW()
)
PROPERTIES (
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "10",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "format" = "json",
    "max_error_number" = "1000",
    "strip_outer_array" = "false"
)
FROM KAFKA (
    "kafka_broker_list" = "127.0.0.1:9092",
    "kafka_topic" = "dicom_state_queue",
    "kafka_partitions" = "0",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);



CREATE ROUTINE LOAD medical_image_load ON dicom_image_meta
COLUMNS (
    tenant_id,
    patient_id,
    study_uid,
    series_uid,
    sop_uid,
    study_uid_hash,
    series_uid_hash,
    study_date_origin,
    content_date,
    content_time,
    instance_number,
    image_type,
    image_orientation_patient,
    image_position_patient,
    slice_thickness,
    spacing_between_slices,
    slice_location,
    samples_per_pixel,
    photometric_interpretation,
    width,
    `columns`,
    bits_allocated,
    bits_stored,
    high_bit,
    pixel_representation,
    rescale_intercept,
    rescale_slope,
    rescale_type,
    window_center,
    window_width,
    transfer_syntax_uid,
    pixel_data_location,
    thumbnail_location,
    sop_class_uid,
    image_status,
    space_size,
    created_time,
    updated_time = NOW()
)
PROPERTIES (
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "10",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "format" = "json",
    "max_error_number" = "1000",
    "strip_outer_array" = "false"
)
FROM KAFKA (
    "kafka_broker_list" = "127.0.0.1:9092",
    "kafka_topic" = "dicom_image_queue",
    "kafka_partitions" = "0",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);

This creates three ROUTINE LOAD jobs, one each for DICOM object metadata, DICOM state metadata, and DICOM image metadata. WADO-RS access logs can be handled in the same way, based on actual requirements.
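
Routine Load jobs run continuously; from the FE you can monitor and control them with the standard statements (job names as created above):

SHOW ROUTINE LOAD FOR medical_state_load;
PAUSE ROUTINE LOAD FOR medical_state_load;
RESUME ROUTINE LOAD FOR medical_state_load;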

For more information about building cloud DICOM systems, see our guide on how to build cloud DICOM systems.

This guide has covered the preparations for building scalable cloud DICOM-WEB services: infrastructure setup, database design, and system architecture considerations for medical imaging applications.
