# UDM HTTP API

The *UDM REST API* is a JSON-HTTP interface designed to interact with the Univention Directory Manager (UDM).
It follows a RESTful architecture, striving to maintain adherence to REST principles while also providing an OpenAPI schema.

The primary objectives of the *UDM REST API* are:
- Facilitating the management of LDAP directory data.
- Supporting efficient machine-to-machine communication (see the reference client `univention.admin.rest.client`).
- Providing all functionality needed to eventually replace the UDM-UMC module with a new web interface.

We are committed to maintaining backwards compatibility, while applying the REST principles allows the service to evolve.
We provide an OpenAPI schema and guarantee API stability when the latest schema is used.

## RESTful architecture

The *UDM REST API* adheres to the RESTful architectural style, as defined by Roy T. Fielding in his dissertation on [Architectural Styles and the Design of Network-based Software Architectures](https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm). REST, which stands for **RE**presentational **S**tate **T**ransfer, encompasses six architectural constraints and four interface constraints, making a service "RESTful."

This enables scalable interaction between components, a generic interface, independent development of components, reduced latency via intermediate components, etc.

### How are the REST constraints satisfied?

1. Client-Server

    The *Client-Server* constraint enforces that there is a clear distinction between a passive server component and an active client component.
    The server component holds authority over the entire service realm and its meaning, while the client component must not make assumptions about the server logic (such as how URLs are constructed).
    This clear separation excludes pseudo Client-Server models, like RPC-based applications, where the server expands the programming interface into the clients, essentially making them one unit.

    **Why Client-Server separation matters**: The Client-Server constraint brings several benefits. It allows clients and servers to evolve independently, supporting separation of concerns and reducing interdependencies. Clients can focus on user interface and hypermedia aspects, while servers concentrate on the business logic, representing resources from their stored state.

    **Compliance**: The *UDM REST API* accomplishes this by providing hypermedia controls through *Link* relations, included in Hypertext Application Language (HAL) response representations, as well as "Link" HTTP response headers.
    These hypermedia controls offer discoverability, enabling clients to navigate the API efficiently and independently.

    **Compliance violations**: By providing an **OpenAPI schema**, the *UDM REST API* violates this constraint, because autogenerated RPC clients such as our [UDM REST API OpenAPI schema based Python client](https://github.com/univention/python-udm-rest-api-client/) move knowledge that belongs in the hypermedia controls (such as hardcoded URLs or HTTP status codes) into the client logic. Nevertheless, adding it was a conscious choice aimed at a more developer-friendly experience. To keep future evolution of the *UDM REST API* possible, we require that clients are generated against the latest OpenAPI schema.

2. Stateless

    The *Stateless* constraint enforces that communication between clients and servers must be stateless, i.e. each request must contain all the information necessary for the server to fully comprehend and process the request.
    This means that the entire session state lies within the client, and the server/resource state should be stored in separate databases.
    This separation allows for scalability through the addition of server instances or processes, as each request can be handled independently.
    Consequently, mechanisms like cookies and authentication via form-based requests or SAML cannot be used.

    **Why statelessness matters**: Stateless communication simplifies the server's implementation and enables the scalability of services. Multiple server instances or processes can efficiently handle client requests without relying on shared session state, which can be a bottleneck in high-demand scenarios.

    **Compliance**: The *UDM REST API* accomplishes this by not storing any state across requests and utilizing HTTP basic authentication for user authentication. Additionally, authentication via OIDC/Bearer will be provided in the future, which is compliant with the statelessness constraint.

    **Compliance violations**: The statelessness constraint is currently violated during operations that involve moving LDAP objects or renaming containers. Progress information for such operations is stored in shared memory, but only in the processes for the same locale. Further requests can poll the progress state. These operations are performed in separate threads rather than separate processes.
    The LDAP connections are bound to the basic authentication credentials, and open LDAP connections are cached between requests to optimize performance.

3. Cache

    The *Cache* constraint emphasizes the importance of marking data in a response as either explicitly or implicitly cacheable.
    This optimization enhances performance by reducing the need for repetitive requests to the server.

    **Compliance**: The *UDM REST API* complies with the Cache constraint by thoughtfully employing various HTTP response headers, including *Cache-Control*, *ETag*, *Last-Modified* and *Vary* to distinctly mark responses as (non-)cacheable.
    The *UDM REST API* takes into consideration both private and public cache flags, primarily for security reasons. These flags help determine the accessibility and scope of cached data, ensuring that sensitive information remains appropriately protected.
    In addition to the response headers, the *UDM REST API* also evaluates specific HTTP request headers, such as *If-Match*, *If-None-Match*, *If-Modified-Since*, *If-Unmodified-Since*.
    These headers serve two key purposes:
        1. **Revalidation and Reuse**: Clients can use these headers to revalidate and reuse cached responses, thereby reducing unnecessary data transfers.
        2. **Conditional requests**: These headers facilitate conditional PUT / DELETE requests when updating an object. This guards against parallel changes to the same object.

    **Compliance violations**:
    The *UDM REST API* is provided behind the Apache HTTP gateway server. A server cache could offer significant performance improvements.
    The Python client reference implementation does not yet incorporate a client cache.

4. Uniform Interface

    The *Uniform Interface* constraint requires components to communicate using generic, standardized data formats that all involved components understand, and requires the interface to conform to the four interface constraints described below.
    The same uniform interface, complying with these constraints, has to be provided for manipulating all server data.
    This means that no application-specific data formats or schemas are invented, which ensures that all components, whether they are clients, servers, or other intermediaries, can work seamlessly with the API using the same standardized interface.
    While JSON is a standardized data format, it lacks built-in mechanisms for semantic meaning, especially regarding hypermedia interaction (like defining data types, constraints, form actions).
    It primarily focuses on the structure and representation of data, which makes it unsuitable for use as a uniform interface on its own.

   1. identification of resources

       The *Identification of resources* constraint signifies that all information provided by the server is abstracted as a resource, and each resource must possess one or more names or identifiers, typically represented by a unique HTTP URI.
       These URIs are managed by the server, which holds authority over their assignment.
       Resource access should exclusively occur through these resource identifiers.
       URIs serve as straightforward identifiers and should not carry additional semantic meaning beyond identification.
       Consequently, clients should refrain from manually constructing URIs unless URI templates are explicitly provided by the server.
       Instead, clients should navigate through state transitions using links found within previously fetched representations, allowing them to follow hypermedia links and traverse the API without hardcoded URIs.
       This practice offers several benefits, including the ability for servers to change URIs without disrupting clients, particularly in scenarios where the API emphasizes providing link relations.

       **Compliance**: The *UDM REST API* conforms to this by assigning unique URLs to each unique resource, ensuring clear and consistent identification.
       The API provides link relations in all response representations, encompassing both IANA-registered standard links and self-defined link relations.
       This approach enables clients to navigate the API seamlessly, ensuring that they do not manually construct URIs, except in cases where URI templates are provided.

       The *UDM REST API reference client* upholds this constraint by abstaining from manual URI construction, with the exception of using URL templates from response representations. This practice allows clients to interact with resources in a uniform and agile manner.

       **Compliance violations**:
       The *OpenAPI schema* does not conform to this because it hardcodes all URIs in the schema file, and generated clients construct them manually by inserting strings into path placeholders.
       This practice introduces concerns, as API consumers need to ensure they are consistently using the latest schema.
       Moreover, issues may arise when URIs are constructed by the client instead of being retrieved from the server, such as with UDM objects containing "//" in their DN identifier, which is an illegal URI path component.
       The server uses its own encoding for double slashes, which may change in the future, introducing potential compatibility problems.

   2. manipulation of resources through representations

       A resource represents a set of entities, which can be reflected through representations or identified by URIs if a concrete realization of the concept does not yet exist.
       This fundamental principle implies that the state and representation of a resource can change dynamically over time while remaining the same resource.
       It's important to understand that a representation of a resource is not the resource itself.
       Resources can be represented in various formats, such as HTML, XML, JSON, LDIF (representing its current state), key-value pairs (representing the desired state), images, or even error conditions like *404 Not Found*.
       Unlike RPC, where the representation acts as a direct proxy to an object (and allows, for example, executing object-oriented methods on it), in REST, state changes are achieved by inspecting the response and its provided ways to modify the representation.
       This involves selecting a transformation, creating or modifying a representation, and sending it back to the server.
       Furthermore, a representation is not a fixed, unchanging entity but a dynamic one that can evolve over time.

       **Compliance**: The *UDM REST API* adheres to this constraint by providing diverse representations of individual resources.
       These representations vary in terms of Content-Language (e.g., English, German), Content-Type (e.g., HTML, JSON, HAL-JSON), and different states, such as the modifiability of a UDM object.
       Importantly, these representations can change at any time, influenced by factors like the presence of extended attributes or changes in the upstream UDM library.
       Additionally, the *UDM REST API* offers server-side rendered JSON templates that come pre-populated with default values for object manipulation and creation, simplifying user interactions.

       **Compliance violations**:
       The *OpenAPI schema* does not conform to this constraint. The schema hardcodes request and response schemas for each object type and does not allow for dynamic responses or requests constructed from server-provided forms.
       This approach has limitations, as it doesn't fully embrace the dynamic nature of resource representations and might not be able to adapt effectively to changes in the API.

   3. self-descriptive messages

       *Self-descriptive messages* means that messages are transferred as representations, which consist of resource or request data (identified by a MIME media type), (resource or representation) metadata and control data.
       The MIME media type plays a critical role in specifying both the syntax and semantics of message payloads.
       Metadata, presented in the form of key-value pairs, serves to describe how to interpret the message, defines caching rules, provides authentication information, specifies encodings and languages of the representation, and more.
       In addition, links to other resources must be explicitly included in responses and should never be hidden within client logic.
       Control data, a form of metadata that describes metadata, enables various functionalities, including conditional requests, security settings, message integrity checks, protocol switching, adding proxy or gateway information, supporting partial responses, and facilitating content negotiation.

       This constraint empowers intermediate layers to understand and potentially transform all exchanged messages.
       Self-descriptive messages, when combined with HATEOAS, eliminate the need for extensive API documentation, making the API inherently understandable.

       **Compliance**: The *UDM REST API* conforms to this by leveraging and evaluating self-descriptive HTTP headers such as `Content-Type` / `Accept`, `Content-Language` / `Accept-Language`, `ETag` / `If-None-Match`, `If-Match`, `Cache-Control`, `Last-Modified` / `If-Unmodified-Since` / `If-Modified-Since`, `Vary`, `Authorization` / `WWW-Authenticate`, `Link`, `Retry-After` / `Location`, `User-Agent`, `Allow`.
       Furthermore, content negotiation based on these headers allows the server to select the most suitable representation to send to the client, promoting flexibility and adaptability in communication.

       **Compliance violations**: TODO

   4. hypermedia as the engine of application state (HATEOAS)

       The "Hypermedia as the Engine of Application State" (*HATEOAS*) constraint signifies that representations must not only convey data but also include information to drive the application's state.
       Each response should encompass all available state transfer possibilities, whether through HTML forms, links to state changes, URI templates, or other relevant resources.
       Furthermore, responses must contain links and their associated relation types to other resources, providing a roadmap for clients to navigate without requiring prior hard-coded knowledge of resource interactions.

       Hypermedia refers to data formats that can incorporate hyperlinks and other hypermedia elements, such as forms.
       While standard JSON is not inherently a hypermedia format, there exist specifications like JSON-LD, UBER, SIREN, HAL, Collection+JSON, and Hydra that extend JSON to include hypermedia elements.
       Additionally, HTML, when equipped with libraries like HTMX, can embrace many modern HTTP and JavaScript features in a declarative, hypermedia-compatible RESTful manner.

       *HATEOAS* imposes two key requirements. First, the media type must be known to the client and be sufficiently rich to describe all potential client-server interactions. Second, the client should follow only links included in the representation and should not construct identifiers without user interaction.

       **Compliance**: The *UDM REST API* demonstrates compliance with various aspects of HATEOAS. It provides links in the HTTP Link response header and includes links and embedded resources in the HAL+JSON responses.
       An unsupported HTML developer view is also available, offering a multitude of links between UDM objects and forms for object modification.
       A future goal is to enhance this view with HTMX, potentially replacing the UDM-UMC interface.
       Additionally, representations of UDM modules and objects are enriched with metadata and HAL controls that specify available actions, such as searching, removing, modifying, moving, or creating objects.

       **Compliance violations**: TODO: HAL+JSON is not a hypermedia format and it cannot contain detailed instructions on how to construct requests for modifying objects.

5. Layered System

    The *Layered System* constraint extends the fundamental *Client-Server* model by introducing intermediate components that possess the ability to fully understand and manipulate messages.
    These intermediaries leverage the principles of *Stateless* and *Self-Descriptive Messages* to enhance the architecture.
    Crucially, since each layer operates behind a uniform interface, clients or components are shielded from the specifics of the layers they interact with.
    This opacity of layers adds a remarkable degree of flexibility and adaptability to the system.

    Intermediate components in a layered system encompass various functions, including the possibility of proxies, gateways, client and server caches, and load balancers.
    These components play a pivotal role in routing, optimizing, and securing communication.
    For instance, authentication or authorization can be effectively handled by a gateway component, streamlining security considerations.

    **Compliance**: The *UDM REST API* conforms to this by operating in a stateless manner, ensuring that each request carries all necessary information for understanding and processing.
    The API also maintains self-descriptive messages, particularly in relation to caching rules, providing transparency regarding message semantics and structure.
    The layered system in the *UDM REST API* serves multiple purposes. It allows the API to deliver content in multiple languages.
    Additionally, it segregates specific security headers and static content delivery, efficiently leveraging the capabilities of the Apache gateway.

    **Compliance violations**: None known

6. Code on Demand (Optional)

    The *Code on Demand* constraint grants servers the optional ability to extend client functionality by embedding code, such as scripts or applets, into representations.
    This extensibility empowers servers to enhance the capabilities of client applications, particularly in web-based environments.
    However, it's important to note that this constraint is optional, as it comes with the trade-off of potentially limiting availability to clients capable of executing the embedded code.
    Typically, this constraint is considered in domains where it is known that clients possess the necessary capabilities to support these extensions.

    **Compliance**: The *UDM REST API* uses this optional constraint. The unsupported web user interface makes use of the HTMX JavaScript library to extend HTML functionality. The JSON-API part of the *UDM REST API* will never require this optional constraint.

    As of the current implementation, the *UDM REST API* employs the *Code on Demand* constraint for an additional prototype web user interface (beside the JSON interface).
    It's worth exploring the utilization of the HTMX JavaScript library, which allows for the extension of HTML functionality, enabling features like interactive JSON forms or support for HTTP PUT/DELETE requests.
    Adopting this approach would eliminate the need to create a complex and resource-intensive Vue.js single-page application frontend, which would effectively reimplement the entire server and widget logic.
    Importantly, it's worth noting that this extension is contemplated as an optional enhancement and should not become a requirement for utilizing the JSON-API part of the *UDM REST API*.

    **Compliance violations**: None known
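
Two of the points above, following server-provided links instead of constructing URIs and guarding updates with conditional requests, can be sketched in a few lines of Python. This is a minimal illustration and not the reference client: the `self` relation, the example URL and the payload shape are assumptions made for the example, and the request is only prepared, never sent.

```python
import json
import urllib.request


def find_link(representation, relation):
    """Return the href of the first non-templated link with this relation."""
    links = representation.get("_links", {}).get(relation)
    if links is None:
        return None
    # HAL allows either a single link object or a list of link objects.
    if isinstance(links, dict):
        links = [links]
    for link in links:
        if not link.get("templated", False):
            return link["href"]
    return None


def build_conditional_update(representation, etag, new_state):
    """Prepare a PUT that the server should only apply if the ETag still matches."""
    # The target comes from the representation itself, not from client logic.
    target = find_link(representation, "self")
    if target is None:
        raise ValueError("representation carries no 'self' link")
    return urllib.request.Request(
        target,
        data=json.dumps(new_state).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "If-Match": etag},
    )


# Hypothetical HAL representation, as it might have been fetched earlier.
fetched = {
    "_links": {"self": {"href": "https://primary.example.org/univention/udm/some/object"}},
    "properties": {"description": "old"},
}
request = build_conditional_update(fetched, '"abc123"', {"properties": {"description": "new"}})
```

If another client changed the object in the meantime, its ETag no longer matches and the server can reject the update instead of silently overwriting the parallel change.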

### Versioning and API stability

The *UDM REST API* intentionally avoids the use of API versioning, a practice that is common in many web APIs.
The decision to forgo versioning aligns with the principles of REST.

Within the *UDM REST API*, a range of changes can be expected or imagined over time:
* A property or extended attribute or extended option could be added or removed from a UDM module
* The representation of a property changes (e.g. changing a list into a dict, changing a syntax class, splitting "name" into "firstname" and "lastname")
* A property is now required when creating an object
* A different HTTP status code is required in a response to e.g. a move operation
* An object must be created via POST instead of PUT because the final URI is not known by the client
* The representation of an object changes (e.g. additionally added fields for the raw LDAP values)
* The behavior of a UDM object changes (e.g. referenced objects are removed when deleting an object)
* The syntax class is now more restrictive (e.g. the username doesn't allow numeric-only values anymore)
* A UDM module or property should be renamed (e.g. unifying all DNS records into one DNS record type instead of individual types)
* A new concept "actions" gets implemented, which extends the current concepts of UDM objects ("properties", "options" and "policies")

There are many common practices of versioning web APIs:
* putting a version into the URI path: `/v1/udm/` or `/udm/users/user/v1/`
* putting a version into the URI query: `/udm/foo?version=1`
* putting a version into the domain: `https://apiv1.example.org/`
* implementing a custom HTTP request header: `X-Version: 1`
* creating a custom MIME media type: `Accept: application/vendor.company.appv1+json`
* creating a version parameter for your custom MIME media type: `Accept: application/vendor.company.app+json; version=1`
* using the User-Agent header: `User-Agent: univention-lib/1.0`
* etc

They are all wrong and come with severe disadvantages and maintainability problems (you can research this on your own; there are a lot of articles).
Versioning also has nothing to do with *REST*: the word isn't even mentioned in its definition, because it's an anti-pattern and makes the architecture *unRESTful*.
One of the fundamental constraints of REST is that each resource/entity must have a unique URI.
Therefore, versioning practices that alter URI paths for different versions of the same resource violate this constraint.

Even if it were *RESTful*, what exactly would we put the version number on?
When using a URI path, the whole service and everything in it would be versioned. Is that really necessary?
How often do we expect breaking changes that cannot be handled otherwise?
How often can we force all consumers to replace their clients?
How many of the Univention Web APIs are still using version 1? All.
How do we keep the old code for /v1/ working when exposing a /v2/? If the behavior of the underlying UDM library changes, we couldn't prevent it without a massive layering violation or code duplication. Writing adapters for every UDM module and property? Surely not!
How would versioning solve all the individual imaginable changes mentioned above?

Remarkably, the UDM framework permits most of these changes without the need for versioning.
The *UDM REST API* follows a compatibility contract that extends across UCS major versions.
This contract ensures that clients and servers from the same UCS version are inherently compatible, facilitating seamless communication.

Some practices for dealing with changes without versioning:
* Clients should be designed to gracefully ignore what they do not understand and anticipate data extensions.
* Avoid making incompatible changes to data structures.
* Adding adapters to accept both old and new data formats can ease the transition when changes do occur.
* Make clients forward compatible ahead of a change planned for the next major version
* Encourage regular client updates
* Send a *User-Agent* header from your client, so that, if we really need to make an exception, we can apply special behavior to certain clients only
* Use *HATEOAS* so that the client has no out-of-band information but always operates on the current data
* Version the hypermedia controls; not the service, resources or representation
* ...
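
The first practice, a client that ignores what it does not understand, can be sketched as follows. The layout of the representation (a `properties` mapping) and the chosen property names are assumptions for illustration only.

```python
def read_user(representation):
    """Pick only the fields this client understands and ignore everything else."""
    properties = representation.get("properties", {})
    return {
        "username": properties.get("username"),
        "lastname": properties.get("lastname"),
        # Tolerate a property that some servers might not send.
        "description": properties.get("description", ""),
    }


# A newer server may add fields; the client keeps working unchanged.
newer_response = {
    "properties": {"username": "alice", "lastname": "Doe", "newProperty": ["x"]},
    "someFutureField": {"anything": True},
}
user = read_user(newer_response)
```

A client written this way neither fails on added fields nor depends on optional ones being present, which covers several of the changes listed earlier without any version number.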

## Architecture

The *UDM REST API* consists of an Apache gateway/reverse proxy, a gateway and the server itself.
The gateway and server are using the [Tornado Web Framework](https://github.com/tornadoweb/tornado/).
The split into gateway and server components enables the multi-language support.

### Code
The gateway is defined in `src/univention/admin/rest/server/__init__.py`.
The server is defined in `src/univention/admin/rest/__main__.py`.
All HTTP resources are currently defined in `src/univention/admin/rest/module.py`.

The `module.py` file is too large and should be split into multiple smaller modules.
Additionally the implementation relies on the UDM UMC module implementation.
The dependency order should be reversed by creating a common base implementation which can be used in both components.
([Bug #50118](https://forge.univention.org/bugzilla/show_bug.cgi?id=50118)).

The code is kept close to the UDM UMC module implementation to avoid subtle differences. In the future, the web user interface should use the *UDM REST API* directly.
Therefore it is essential to keep the behavior identical.

### Language support
The *UDM REST API* starts one gateway process and one subprocess for each configured system locale.
All HTTP requests are forwarded to the language specific process by evaluating the `Accept-Language` header.
Each main process also starts another subprocess for sharing memory between those processes via a `multiprocessing.managers.SyncManager`.

UDM currently translates strings at Python import time which makes it impossible to use two languages in one process.
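
The forwarding decision can be pictured as picking the best configured locale for a given `Accept-Language` header. The following is a simplified sketch and not the gateway's actual code; the function names and the fallback behavior are assumptions.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def select_locale(header, available, default):
    """Choose the configured locale whose language best matches the header."""
    by_language = {}
    for locale in available:
        by_language.setdefault(locale.split("_")[0].lower(), locale)
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        language = tag.split("-")[0]
        if language in by_language:
            return by_language[language]
    return default


chosen = select_locale("de-DE,de;q=0.9,en;q=0.8", ["en_US", "de_DE"], "en_US")
```

Because each locale is served by its own process, the result of such a selection determines which subprocess receives the request.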

### System roles
The *UDM REST API* is designed to be run only on a Primary Directory Node.
On a Replica Directory Node, it is not guaranteed that every UDM module exists or is available in the latest version.
Replication might also lag behind, so that data is not up to date.
The only writeable LDAP server is the LDAP server on the Primary Directory Node.

### Authentication
The *UDM REST API* can only be accessed via HTTP Basic authentication.
For authentication, a username and password have to be provided. A special username is `cn=admin`, which is the LDAP root DN.
Authentication via DN or mail address is not possible.
In the future, HTTP Bearer authentication should be implemented to support authentication via a JWT including an ID and/or Access Token.
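
As a minimal sketch of what a client has to send, the following prepares a request with an HTTP Basic `Authorization` header using only the Python standard library. The host name and credentials are placeholders, `/univention/udm/` is assumed to be the entry point, and the request is built but not sent.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def entry_point_request(host, username, password):
    """Prepare (but do not send) a request for the assumed API entry point."""
    return urllib.request.Request(
        f"https://{host}/univention/udm/",
        headers={
            "Authorization": basic_auth_header(username, password),
            "Accept": "application/hal+json",
        },
    )


request = entry_point_request("primary.example.org", "Administrator", "secret")
```

Since the credentials travel with every request, no server-side session is needed, which matches the statelessness constraint described earlier.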

### Authorization
The authorization at the *UDM REST API* is done via group membership. By default Domain Admins, DC Backup Hosts and DC Slave Hosts are allowed to access it.
This can be configured via the UCR variable `directory/manager/rest/authorized-groups/.*`.
There is no individual access restriction to certain modules, properties or operations. Instead, the LDAP ACLs determine the access rights.
In the future fine-grained authorization should be achieved by integrating Open Policy Agent.

### Multiprocessing and shared memory
The *UDM REST API* can be scaled horizontally by starting it with multiprocessing via the UCR variable `directory/manager/rest/processes`.
A value of `0` uses the number of CPU cores as number of subprocesses.
Multiprocessing makes sharing memory between the processes necessary.
We are sharing memory via `multiprocessing.managers.SyncManager()` with information about the child process IDs, authentication cache, a queue for move operation progress information, and LDAP cookies for paginated search operations.
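
The general mechanism can be illustrated with a small, self-contained sketch. It is not the server's code: the helper names and the operation identifier are hypothetical, and it uses a shared dict for brevity, whereas the text above mentions a queue for progress information.

```python
from multiprocessing.managers import SyncManager


def report_progress(store, operation_id, percent):
    """Record the progress of a long-running operation in the shared mapping."""
    store[operation_id] = max(0, min(100, int(percent)))


def read_progress(store, operation_id):
    """Return the last reported progress, or None for an unknown operation."""
    return store.get(operation_id)


if __name__ == "__main__":
    # The manager runs in its own process. The proxy objects it hands out can
    # be passed to worker processes, which then all operate on the same data.
    with SyncManager() as manager:
        progress = manager.dict()
        report_progress(progress, "move-1", 40)
        print(read_progress(progress, "move-1"))
```

The helpers only rely on the mapping interface, so they work with a plain `dict` in tests and with a manager proxy when several processes are involved.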

## OpenAPI interface
The *UDM REST API* provides an OpenAPI schema definition, which allows RPC clients to be autogenerated.
The OpenAPI schema is available at `/univention/udm/openapi.json`.
There is also an OpenAPI schema for each object type, e.g. at `/univention/udm/users/user.json`. This URI is unsupported and may be removed in the future.
A web interface for the OpenAPI schema is provided via `/univention/udm/schema/`.
A client implementation using the OpenAPI schema is provided at [python-udm-rest-api-client](https://github.com/univention/python-udm-rest-api-client/).
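
To get a feeling for what such a schema contains, the operations of an already loaded OpenAPI document can be listed with a few lines of Python. This sketch relies only on the general OpenAPI structure (`paths`, HTTP methods, `operationId`); the sample document, including its path and operation IDs, is invented for illustration and not taken from the real schema.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(schema):
    """Return (METHOD, path, operationId) for every operation in the schema."""
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method, operation in item.items():
            # Path items may also carry non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId")))
    return operations


# Invented sample; a real document would be loaded from the schema URL as JSON.
sample_schema = {
    "openapi": "3.0.3",
    "paths": {
        "/users/user/": {
            "parameters": [],
            "get": {"operationId": "example_search"},
            "post": {"operationId": "example_create"},
        },
    },
}
operations = list_operations(sample_schema)
```

Every path listed this way ends up hardcoded in a generated client, which is why such clients should be regenerated against the latest schema.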

## CLI Client
There is an unsupported CLI client and client library called `__udm`, which tries to behave similarly to the real `udm`.
It can be used to easily test basic operations.
A real supported client will follow in the Future™.

## Known issues

* After creating, modifying or moving extended attributes or extended options, the affected modules are not reloaded until the server restarts or special endpoints are requested, which do an expensive reload, e.g. `/univention/udm/users/user/add`. Patch available at [Bug #50253](https://forge.univention.org/bugzilla/show_bug.cgi?id=50253).
* Pagination is unsupported because the SSSVLV overlay module is not enabled in the LDAP server ([Bug #50240](https://forge.univention.org/bugzilla/show_bug.cgi?id=50240) and [Bug #54786](https://forge.univention.org/bugzilla/show_bug.cgi?id=54786)).
