SweMetrics

OpenAlex Bibliometric Data Service at SUNET Cloud

477M Works
106M Authors
2.8B Citations
255K Sources
103K Institutions

Swemetrics.se is a portal to the OpenAlex Bibliometric Data Service at SUNET Cloud. The project aims to evaluate and establish a national Swedish data source for bibliometric analyses, based on OpenAlex snapshot data. Such a data source could simplify data sharing and analysis workflows for bibliometric analysts at Swedish universities, research funders and other organizations in the Swedish research ecosystem.

The project is in an early development and testing phase, currently run mainly as a collaboration between KTH Royal Institute of Technology Library, Karolinska Institutet Library and SUNET.

Flight SQL (ADBC)

High-performance remote database access via Arrow Flight SQL. Connect from R, DuckDB, Python or any ADBC-compatible client.

grpc+tls://swemetrics.se:31337

REST API

HTTP API for SQL queries. Returns JSON results. Requires authentication. Read API docs:

/duckdb/api/docs...

Web Interface

Interactive SQL query interface. Explore data visually in your browser.

/ui/

Object Storage / S3 Buckets

Sync data directly from S3 / object storage. Available as compressed tabular .parquet files, single-file DuckDB databases or custom extracts.

s3.swemetrics.se

Database Clients

Connect using desktop or WASM database clients, or from programming languages such as R and Python.

Try DBeaver or gizmosql-ui

Usage examples for tools and LLM integrations

API usage guide and recommendations for analysts, tool integrations or LLM usage

See API_GUIDE_FOR_LLMS.md

Quick Start

Connect with DuckDB CLI using adbc_scanner:

INSTALL adbc_scanner FROM community;
LOAD adbc_scanner;
CREATE SECRET (TYPE adbc, SCOPE 'grpc+tls://swemetrics.se:31337', ...);
ATTACH 'grpc+tls://swemetrics.se:31337' AS db (TYPE adbc);
FROM db.works SELECT * LIMIT 10;

Connect with DuckDB CLI against the API:

-- Read JSON, Parquet or CSV from the API
FROM read_json('https://swemetrics.se/duckdb/api/works?limit=10&api_key=your_key_here'), unnest(data) _(x) SELECT x.work_id, x.title;
FROM read_parquet('https://swemetrics.se/duckdb/api/works?limit=10&default_format=parquet&api_key=your_key_here');
FROM read_csv('https://swemetrics.se/duckdb/api/works?limit=10&default_format=csv&api_key=your_key_here');

Connect with R, either through ADBC database connections or by reading CSV, JSON or Parquet directly from the API:

# CSV: read directly from the API URL, passing api_key as a query parameter
readr::read_csv("https://swemetrics.se/duckdb/api/institutions?limit=100&default_format=csv&api_key=your_api_key_here")

# Parquet: typed; bigint columns are read as integer64
arrow::read_parquet("https://swemetrics.se/duckdb/view/api_works_food?limit=100&default_format=parquet&api_key=your_api_key_here")
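
Connect with Python against the API. This is a minimal sketch, assuming the query parameters used in the examples above (limit, default_format, api_key) and a JSON response whose rows sit under a top-level data key, as the DuckDB unnest(data) example implies. The helper names build_api_url, rows_from_payload and fetch_rows are illustrative and not part of the service:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base path taken from the example URLs above.
BASE_URL = "https://swemetrics.se/duckdb/api"


def build_api_url(endpoint, api_key, limit=10, fmt=None):
    """Build a request URL for an endpoint such as 'works' or 'institutions'."""
    params = {"limit": limit}
    if fmt is not None:
        # 'parquet' and 'csv' appear in the examples; omitting it yields JSON.
        params["default_format"] = fmt
    params["api_key"] = api_key
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


def rows_from_payload(payload, columns):
    """Pick the given columns from each record under the 'data' key (assumed layout)."""
    return [tuple(record.get(col) for col in columns) for record in payload["data"]]


def fetch_rows(endpoint, api_key, columns, limit=10):
    """Fetch JSON from the API and return the selected columns.

    Needs network access and a valid API key.
    """
    with urlopen(build_api_url(endpoint, api_key, limit=limit)) as response:
        payload = json.load(response)
    return rows_from_payload(payload, columns)
```

Calling fetch_rows("works", "your_key_here", ["work_id", "title"]) mirrors the DuckDB read_json query above. Building the URL with urlencode keeps keys containing special characters correctly escaped.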

Connect with Python using ADBC:

import adbc_driver_flightsql.dbapi as flight_sql

conn = flight_sql.connect(
    "grpc+tls://swemetrics.se:31337",
    db_kwargs={"username": "...", "password": "..."}
)
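cursor = conn.cursor()
cursor.execute("FROM works SELECT * LIMIT 10")
print(cursor.fetch_arrow_table())  # results as a pyarrow.Table

# Release the cursor and connection when done.
cursor.close()
conn.close()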
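
To keep credentials out of scripts, one option is to read them from the environment and wrap the query in context managers so the connection is always closed. The sketch below assumes the variable names SWEMETRICS_USERNAME and SWEMETRICS_PASSWORD, which are hypothetical and not defined by the service, and the helper names are illustrative:

```python
import os

FLIGHT_SQL_URI = "grpc+tls://swemetrics.se:31337"


def credentials_from_env(environ=os.environ):
    """Return db_kwargs for the Flight SQL driver from hypothetical environment variables."""
    try:
        return {
            "username": environ["SWEMETRICS_USERNAME"],
            "password": environ["SWEMETRICS_PASSWORD"],
        }
    except KeyError as missing:
        raise RuntimeError(f"Set {missing.args[0]} before connecting") from None


def query_to_arrow(sql, uri=FLIGHT_SQL_URI):
    """Run a query over Flight SQL and return a pyarrow.Table.

    Needs network access and an account on the service.
    """
    # Imported here so credentials_from_env works without the driver installed.
    import adbc_driver_flightsql.dbapi as flight_sql

    with flight_sql.connect(uri, db_kwargs=credentials_from_env()) as conn:
        with conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch_arrow_table()
```

With both variables set, query_to_arrow("FROM works SELECT * LIMIT 10") runs the same query as the example above.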