Introduction

Welcome to the API of Docparser! You can use this API to list your Document Parsers, import documents, and obtain parsed data.

The Docparser API is organized around REST. Our API has predictable, resource-oriented URLs, and uses clear response messages to indicate API errors.

The cURL requests shown alongside each endpoint are designed to show you how to call our API. All you need to do is replace <secret_api_key> in the sample with your private API token.

There are no language bindings other than cURL yet. If you are developing an API client in your preferred language, please let us know and we will gladly feature it here.

This documentation was last updated 2017-06-25.

Authentication

curl https://api.docparser.com/v1/ping \
   -u <secret_api_key>:
{"msg": "pong"}

Authenticate your account when using the API by including your secret API key in the request headers. You can obtain and reset your secret API key in the API Settings of your Docparser Account. Your API key carries many privileges, so be sure to keep it secret!

Authentication to the API is performed via HTTP Basic Auth. Provide your API key as the basic auth username value. You do not need to provide a password.

All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.

You can test if the authentication works by pinging the following URL. Please make sure to include the correct authentication headers as described above.

GET https://api.docparser.com/v1/ping
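
For clients written in a language other than cURL, the Basic Auth scheme described above amounts to sending an Authorization header whose value is the Base64 encoding of the API key followed by a colon (the empty password). The following Python sketch builds the ping request with the standard library only and does not send it. The helper names basic_auth_header and build_ping_request and the placeholder key are illustrative assumptions, not part of the Docparser API.

```python
import base64
import urllib.request

API_BASE = "https://api.docparser.com/v1"


def basic_auth_header(api_key: str) -> str:
    """Return the Authorization header value for HTTP Basic Auth.

    The API key is the username and the password is empty, so the
    credentials string is "<api_key>:" (note the trailing colon).
    """
    credentials = f"{api_key}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_ping_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the authenticated GET /ping request."""
    request = urllib.request.Request(f"{API_BASE}/ping", method="GET")
    request.add_header("Authorization", basic_auth_header(api_key))
    return request


if __name__ == "__main__":
    # "secret_api_key" is a placeholder; substitute your own key.
    req = build_ping_request("secret_api_key")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req).read()
```

This mirrors what the -u <secret_api_key>: option does in the cURL sample.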

Parsers

List Document Parsers

curl https://api.docparser.com/v1/parsers \
   -u <secret_api_key>:
[{
  "id":"mwekrupomwekrupo",
  "label":"Acme Inc. Invoice Parser"
},{
  "id":"cadqtvgjcadqtvgj",
  "label":"Acme Corp. Invoice Parser"
}]

This endpoint returns a list of all Document Parsers linked to your account. Each entry contains an id and a label. The id value can be used in other API routes, e.g. for importing documents to a specific Document Parser or obtaining parsing results.

GET https://api.docparser.com/v1/parsers
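
Since the id is what the other routes expect, a client will often want to look it up by label. A minimal Python sketch, using the sample response shown for this endpoint as input. The function name parser_ids_by_label is a hypothetical helper, and the sketch assumes labels are unique within an account, which this documentation does not state.

```python
import json

# Sample response body copied from the example for this endpoint.
SAMPLE_RESPONSE = """
[{
  "id":"mwekrupomwekrupo",
  "label":"Acme Inc. Invoice Parser"
},{
  "id":"cadqtvgjcadqtvgj",
  "label":"Acme Corp. Invoice Parser"
}]
"""


def parser_ids_by_label(response_body: str) -> dict:
    """Map each Document Parser label to its id.

    Assumption: labels are unique. If two parsers share a label,
    the later entry silently wins.
    """
    parsers = json.loads(response_body)
    return {entry["label"]: entry["id"] for entry in parsers}


if __name__ == "__main__":
    ids = parser_ids_by_label(SAMPLE_RESPONSE)
    print(ids["Acme Inc. Invoice Parser"])
```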

Documents

Import Documents

We offer several options to import a document to Docparser to make it as easy as possible for you to integrate Docparser in your existing workflow.

Besides manually uploading your documents with our app, you can also use this API. You can choose between an HTML form upload request and providing a publicly accessible URL from which the document can be fetched.

Hint: Another easy way of importing your documents is to forward them by e-mail to a private inbox linked to your account. You can learn more about this method in the settings of your Document Parser.

Upload Document

curl \
  -u <secret_api_key>: \
  -F "file=@/home/your/local/file.pdf" \
  https://api.docparser.com/v1/document/upload/<PARSER_ID>
{
    "id" : "abc123efg456",
    "file_size" : 119540,
    "quota_used" : 642,
    "quota_left" : 258,
    "quota_refill" : "2017-05-02T02:43:48+00:00"
}

Uploading a document to Docparser works like uploading a file with an HTML form. All you need to do is send a multipart/form-data request to the API endpoint containing the document in the form field file.

The return value of a successful upload is the ID of the newly created document, the filesize of the imported document as well as account usage data.
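
As an illustration of reading that return value, the Python sketch below parses the sample response and converts quota_refill into a timezone-aware datetime. The function name summarize_import and the keys of the returned dictionary are assumptions made for this example. The sketch does not interpret the quota numbers beyond passing them through.

```python
import json
from datetime import datetime

# Sample response body copied from the upload example.
SAMPLE_RESPONSE = """
{
    "id" : "abc123efg456",
    "file_size" : 119540,
    "quota_used" : 642,
    "quota_left" : 258,
    "quota_refill" : "2017-05-02T02:43:48+00:00"
}
"""


def summarize_import(response_body: str) -> dict:
    """Extract the document ID and usage data from an import response."""
    data = json.loads(response_body)
    return {
        "document_id": data["id"],
        "file_size": data["file_size"],
        "quota_used": data["quota_used"],
        "quota_left": data["quota_left"],
        # ISO 8601 string with UTC offset -> timezone-aware datetime.
        "quota_refill": datetime.fromisoformat(data["quota_refill"]),
    }


if __name__ == "__main__":
    summary = summarize_import(SAMPLE_RESPONSE)
    print(summary["document_id"], summary["quota_refill"].isoformat())
```

Keep the document_id: it is the <DOCUMENT_ID> needed later to request the parsed data of this document.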

Each of your Document Parsers has a unique API route to which you need to send your request. The <PARSER_ID> shown in the URL below can be obtained by calling the List Parsers API route. You can also easily obtain the <PARSER_ID> inside the Docparser app in the settings of your Document Parser under Settings > API.

In addition, you can submit an arbitrary string to Docparser which will be stored together with the uploaded document. The submitted value (remote_id) will be kept throughout the processing and will be available later once you obtain the parsed data with our API, as CSV/XLS/XML file or through webhooks. This optional parameter makes it easy to relate the parsed data returned by Docparser with document records in your own system.

POST https://api.docparser.com/v1/document/upload/<PARSER_ID>

Parameter    Description
file         The file object to upload
remote_id    Optional parameter to pass through your own document ID
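
For a client that cannot shell out to cURL, the upload can be sketched in Python with the standard library by encoding the form fields as multipart/form-data by hand. This is a minimal sketch under assumptions: the helper names encode_multipart and build_upload_request are hypothetical, the file part is sent as application/octet-stream, and file names containing double quotes are not escaped. The request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request
import uuid

API_BASE = "https://api.docparser.com/v1"


def encode_multipart(file_name, file_bytes, remote_id=None):
    """Encode the form fields as multipart/form-data.

    Returns a (body, content_type) tuple. Simplification: file_name
    is not escaped, so it must not contain double quotes.
    """
    boundary = uuid.uuid4().hex
    parts = []
    if remote_id is not None:
        parts += [
            f"--{boundary}".encode("ascii"),
            b'Content-Disposition: form-data; name="remote_id"',
            b"",
            remote_id.encode("utf-8"),
        ]
    disposition = f'Content-Disposition: form-data; name="file"; filename="{file_name}"'
    parts += [
        f"--{boundary}".encode("ascii"),
        disposition.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode("ascii"),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(api_key, parser_id, file_name, file_bytes, remote_id=None):
    """Build (but do not send) the POST request for the upload route."""
    body, content_type = encode_multipart(file_name, file_bytes, remote_id)
    url = f"{API_BASE}/document/upload/{urllib.parse.quote(parser_id, safe='')}"
    request = urllib.request.Request(url, data=body, method="POST")
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", content_type)
    return request


if __name__ == "__main__":
    # Placeholder key, parser ID, and file content for illustration only.
    req = build_upload_request(
        "secret_api_key", "PARSER_ID", "file.pdf", b"%PDF-1.4", remote_id="order-42"
    )
    print(req.get_method(), req.full_url, len(req.data), "bytes")
    # Sending it would be: urllib.request.urlopen(req).read()
```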

Fetch Document From URL

curl \
  -u <secret_api_key>: \
  -F "url=http://www.pdf995.com/samples/pdf.pdf" \
  https://api.docparser.com/v1/document/fetch/<PARSER_ID>
{
    "id" : "abc123efg456",
    "file_size" : 119540,
    "quota_used" : 642,
    "quota_left" : 258,
    "quota_refill" : "2017-05-02T02:43:48+00:00"
}

If your files are stored under a publicly accessible URL, you can also import a document by providing the URL to our API. This method is straightforward: you just need to perform a simple POST or GET request with url as the parameter.

The return value of a successful fetch request is the ID of the newly created document, the filesize of the imported document as well as account usage data.

Each of your Document Parsers has a unique API route to which you need to send your request. The <PARSER_ID> shown in the URL below can be obtained by calling the List Parsers API route. You can also easily obtain the <PARSER_ID> inside the Docparser app in the settings of your Document Parser under Settings > API.

In addition, you can submit an arbitrary string to Docparser which will be stored together with the fetched document. The submitted value (remote_id) will be kept throughout the processing and will be available later once you obtain the parsed data with our API, as CSV/XLS/XML file or through webhooks. This optional parameter makes it easy to relate the parsed data returned by Docparser with document records in your own system.

POST https://api.docparser.com/v1/document/fetch/<PARSER_ID>

GET https://api.docparser.com/v1/document/fetch/<PARSER_ID>

Parameter    Description
url          The location of a publicly accessible document
remote_id    Optional parameter to pass through your own document ID
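
When the GET variant is used, the document URL itself contains characters such as : and / that need to be percent-encoded. The Python sketch below builds such a request URL. It assumes the parameters are passed in the query string for GET requests, which this documentation does not spell out, and build_fetch_url is a hypothetical helper name.

```python
import urllib.parse

API_BASE = "https://api.docparser.com/v1"


def build_fetch_url(parser_id, document_url, remote_id=None):
    """Build the GET URL for the fetch route.

    Assumption: for GET requests the parameters go in the query string.
    """
    params = {"url": document_url}
    if remote_id is not None:
        params["remote_id"] = remote_id
    # urlencode percent-encodes the document URL so it survives
    # being nested inside another URL.
    query = urllib.parse.urlencode(params)
    path = urllib.parse.quote(parser_id, safe="")
    return f"{API_BASE}/document/fetch/{path}?{query}"


if __name__ == "__main__":
    # PARSER_ID is a placeholder; the document URL is the one from the sample.
    print(build_fetch_url("PARSER_ID", "http://www.pdf995.com/samples/pdf.pdf"))
```

The request still needs the Basic Auth header described in the Authentication section.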

Parsed Data

Docparser provides three ways to obtain the data parsed from your documents: through this API, as a CSV/XLS/XML file, or through webhooks. The API routes are described below.

Get One Data Set

curl \
  -u <secret_api_key>: \
  https://api.docparser.com/v1/results/<PARSER_ID>/<DOCUMENT_ID>
[
    {
        "id": "967bcf5658d73c80563072373d5002e3",
        "document_id": "1d35639d4b53b59e77f737c93cd1d3d7",
        "remote_id": "your_optional_id",
        "file_name": "pdf.pdf",
        "media_link": "https://api.docparser.com/v1/document/media/...",
        "media_link_original": "https://api.docparser.com/v1/document/media/.../original",
        "media_link_data": "https://api.docparser.com/v1/document/media/.../data",
        "page_count": 4,
        "uploaded_at": "2016-07-27T14:57:05+00:00",
        "processed_at": "2016-07-27T14:57:10+00:00",
        "purchase_number": "ABC123",
        "customer": {
            "last_name" : "Doe",
            "first_name" : "John"
        },
        "table_data": [{
            "key" : "value"
        }, {
            "key" : "value"
        },
        ...
        ],
        "....": "...."
    }
]

This API route returns the parsed data of one document. The response structure is identical to that of the Get Multiple Data Sets route, except that the returned array contains a single object representing the data of the requested document.

The <PARSER_ID> shown in the URL below can be obtained by calling the List Parsers API route. You can also easily obtain the <PARSER_ID> inside the Docparser app in the settings of your Document Parser under Settings > API.

The <DOCUMENT_ID> is returned when uploading/importing a new document.

GET https://api.docparser.com/v1/results/<PARSER_ID>/<DOCUMENT_ID>

Parameter    Default    Description
format       object     Valid values are object or flat. By default, parsed document data is returned as nested JSON objects. Setting this parameter to flat will return a simplified version of the parsed data which contains flat key/value pairs instead of nested objects.
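
Because the response is an array even for a single document, client code has to unwrap it. The Python sketch below builds the request URL with the optional format parameter and extracts the one object from a trimmed copy of the sample response. The helper names build_result_url and single_result are hypothetical, passing format in the query string is an assumption, and the fields purchase_number and customer are the sample's example fields, not fixed parts of the API.

```python
import json
import urllib.parse

API_BASE = "https://api.docparser.com/v1"

# Trimmed copy of the sample response (elided parts left out).
SAMPLE_RESPONSE = """
[
    {
        "id": "967bcf5658d73c80563072373d5002e3",
        "document_id": "1d35639d4b53b59e77f737c93cd1d3d7",
        "remote_id": "your_optional_id",
        "file_name": "pdf.pdf",
        "page_count": 4,
        "purchase_number": "ABC123",
        "customer": {"last_name": "Doe", "first_name": "John"}
    }
]
"""


def build_result_url(parser_id, document_id, result_format=None):
    """Build the GET URL for the parsed data of one document."""
    if result_format not in (None, "object", "flat"):
        raise ValueError("format must be 'object' or 'flat'")
    parser = urllib.parse.quote(parser_id, safe="")
    document = urllib.parse.quote(document_id, safe="")
    url = f"{API_BASE}/results/{parser}/{document}"
    if result_format is not None:
        # Assumption: the parameter is passed in the query string.
        url += "?" + urllib.parse.urlencode({"format": result_format})
    return url


def single_result(response_body):
    """Unwrap the one-element array returned for a single document."""
    results = json.loads(response_body)
    if len(results) != 1:
        raise ValueError(f"expected exactly one result, got {len(results)}")
    return results[0]


if __name__ == "__main__":
    result = single_result(SAMPLE_RESPONSE)
    print(result["document_id"], result["customer"]["last_name"])
```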

Get Multiple Data Sets

curl \
  -u <secret_api_key>: \
  https://api.docparser.com/v1/results/<PARSER_ID>
[
    {
        "id": "967bcf5658d73c80563072373d5002e3",
        "document_id": "1d35639d4b53b59e77f737c93cd1d3d7",
        "remote_id": "your_optional_id",
        "file_name": "pdf.pdf",
        "media_link": "https://api.docparser.com/v1/document/media/...",
        "media_link_original": "https://api.docparser.com/v1/document/media/.../original",
        "media_link_data": "https://api.docparser.com/v1/document/media/.../data",
        "page_count": 1,
        "uploaded_at": "2016-07-27T14:57:05+00:00",
        "processed_at": "2016-07-27T14:57:10+00:00",
        "purchase_number": "ABC123",
        "customer": {
            "last_name" : "Doe",
            "first_name" : "John"
        },
        "table_data": [{
            "key" : "value"
        }, {
            "key" : "value"
        },
        ...
        ],
        "....": "...."
    },
    {
       ....
    }
]

This API route returns a list of JSON objects representing the parsed data. By default, it returns the data of the last 100 documents in reverse chronological order. Additional parameters can be used to change this default behaviour.

The <PARSER_ID> shown in the URL below can be obtained by calling the List Parsers API route. You can also easily obtain the <PARSER_ID> inside the Docparser app in the settings of your Document Parser under Settings > API.

GET https://api.docparser.com/v1/results/<PARSER_ID>

Parameter    Default          Description
format       object           Valid values are object or flat. By default, parsed document data is returned as nested JSON objects. Setting this parameter to flat will return a simplified version of the parsed data which contains flat key/value pairs instead of nested objects.
list         last_uploaded    Valid values are last_uploaded, uploaded_after and processed_after. By default, the data of the last uploaded documents is returned in reverse chronological order. If set to uploaded_after, documents imported after a certain date are returned (see below). If set to processed_after, documents parsed after a certain date are returned (see below).
limit        100              This parameter indicates how many documents to include when the parameter list is set to last_uploaded. The maximum number of documents which can be returned is limited to 10,000.
date         (none)           This parameter is mandatory if the parameter list is set to uploaded_after or processed_after. The parameter needs to be a valid ISO 8601 date string (e.g. 2017-02-12T15:19:21+00:00) or a Unix timestamp and determines which documents are included in the return. Please note that the maximum number of documents which can be returned is limited to 10,000.
remote_id    (none)           When this parameter is set, only documents having the provided remote_id will be returned. The remote_id of a document can be set when importing the file via the API (see above).
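
The interplay between list, limit and date can be checked on the client before a request is made. The Python sketch below encodes the rules from the table: date is required for uploaded_after and processed_after, and limit applies to last_uploaded and may not exceed 10,000. The function name build_results_url is hypothetical, passing the parameters in the query string is an assumption, and dropping a limit supplied with a date-based list is a choice made for this sketch rather than documented API behaviour.

```python
import urllib.parse

API_BASE = "https://api.docparser.com/v1"
MAX_DOCUMENTS = 10000
LIST_MODES = ("last_uploaded", "uploaded_after", "processed_after")


def build_results_url(parser_id, list_mode="last_uploaded", limit=None,
                      date=None, remote_id=None, result_format=None):
    """Build the GET URL for the multiple-data-sets route."""
    if list_mode not in LIST_MODES:
        raise ValueError(f"list must be one of {LIST_MODES}")
    params = {"list": list_mode}
    if list_mode == "last_uploaded":
        if limit is not None:
            if not 1 <= limit <= MAX_DOCUMENTS:
                raise ValueError(f"limit must be between 1 and {MAX_DOCUMENTS}")
            params["limit"] = limit
    else:
        # uploaded_after / processed_after: date is mandatory, and
        # limit is not sent (a choice made for this sketch).
        if date is None:
            raise ValueError(f"date is mandatory when list is {list_mode}")
        params["date"] = date
    if remote_id is not None:
        params["remote_id"] = remote_id
    if result_format is not None:
        if result_format not in ("object", "flat"):
            raise ValueError("format must be 'object' or 'flat'")
        params["format"] = result_format
    path = urllib.parse.quote(parser_id, safe="")
    return f"{API_BASE}/results/{path}?{urllib.parse.urlencode(params)}"


if __name__ == "__main__":
    # PARSER_ID is a placeholder; the date is the example from the table.
    print(build_results_url("PARSER_ID", "processed_after",
                            date="2017-02-12T15:19:21+00:00"))
```

Note that the + in the ISO 8601 offset is percent-encoded as %2B. Left unencoded, it would typically be decoded as a space on the server side.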