# How Ping Processes SOVs

> High-level behavior of SOV processing in **Ping.Extraction**

**Ping.Extraction** turns a raw Statement of Values workbook into structured, normalized
building data, then generates that data in the output formats your organization uses.
This page explains the behavior behind that process: the lifecycle of a parsed SOV,
the status values you poll, how updates (SUDs) behave, and how to regenerate outputs
without re-uploading the source file. For the terms and identifiers used throughout,
see [Concepts & Identifiers](/concepts-and-identifiers). For step-by-step API call
sequences, see the workflows on the [Getting Started](/quickstart) page.

## How an input SOV becomes a Ping SOV

Parsing is asynchronous. One request starts the job, and everything after the upload
happens on Ping servers while you poll for completion.

1. **Upload.** Call
   [Start SOV Parsing Job](/ping-extraction/parse-sovs/start-sov-parsing-job)
   with the source file, a `document_type` of `SOV`, the `output_formats` you want
   generated, and any `integrations` to apply. The response returns the SOV ID
   immediately, before processing finishes. Save it. Every later request about this
   SOV uses it.
2. **Identification and parsing.** Ping identifies the document type, locates the
   building rows in the workbook, and maps the source columns onto standard Ping
   attributes. The [JSON Format Specification](/json-formats/top-level) describes the
   resulting structure, and [Buildings Attributes](/json-formats/buildings) covers the
   per-building fields.
3. **Enrichment.** Requested integrations add third-party data, such as geocoding
   results and hazard scores, to each building.
4. **Output generation.** Ping renders the normalized data into each requested format.
   Formats you skip here can still be generated later. See
   [Regenerating outputs](#regenerating-outputs).
5. **Completion.** Poll
   [Check SOV Parsing Status](/ping-extraction/parse-sovs/check-sov-parsing-status)
   until `request.status` is `COMPLETE` or `FAILED`. On success, `result.outputs`
   lists each generated file with a `url`. Download them with
   [Fetch Outputs of SOV Parsing Job](/ping-extraction/parse-sovs/fetch-outputs-of-sov-parsing-job).
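
The completion step above can be sketched as a small polling helper. This is a minimal illustration, not an official client: `poll_until_terminal`, `output_urls`, and the `fetch_status` callable are hypothetical names, and the code assumes the status response is a JSON object with `request.status` and `result.outputs` nested as plain keys. The polling interval and attempt limit are arbitrary defaults.

```python
import time
from typing import Any, Callable, Dict, List

# The only terminal values of request.status, per this page.
TERMINAL_STATES = {"COMPLETE", "FAILED"}


def poll_until_terminal(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until request.status is COMPLETE or FAILED.

    fetch_status is a hypothetical callable that performs the status
    request and returns the decoded JSON body as a dict.
    """
    for attempt in range(max_attempts):
        response = fetch_status()
        status = response.get("request", {}).get("status")
        if status in TERMINAL_STATES:
            return response
        # Intermediate values are ignored on purpose: only terminal ones matter.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job not terminal after {max_attempts} polls")


def output_urls(response: Dict[str, Any]) -> List[str]:
    """Collect the url of each entry in result.outputs (assumed to be a list)."""
    outputs = response.get("result", {}).get("outputs", [])
    return [output["url"] for output in outputs]
```

In practice `fetch_status` would wrap your HTTP call to Check SOV Parsing Status for the saved SOV ID. Keeping the transport out of the helper lets the same loop serve update jobs as well.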

Three behaviors are worth knowing up front. First, submitting the same file twice
creates two unrelated SOVs with two different IDs, so re-uploading is never the way to
get fresh outputs from an already-parsed SOV. Second, the `id` inside a JSON output
behaves differently from the job IDs: it holds the parent SOV ID and keeps that value
in every regenerated output, so updating an SOV does not change the `id` in its JSON.
Third, the SOV ID is opaque. Its shape (`s-pl-ping-21nyms3`) is observable but not a
contract, so store it as a string and pass it through unchanged.

## Status values across endpoints

Every job moves through one lifecycle, but different endpoints encode it differently.
Job status endpoints
([Check SOV Parsing Status](/ping-extraction/parse-sovs/check-sov-parsing-status)
and [Check SOV Update Status](/ping-extraction/update-sovs/check-sov-update-status))
return full words in `request.status`.
[List Historical SOVs](/ping-extraction/get-sov-data/list-historical-sovs) returns
single-letter codes in `status`. The encodings map one to one:

| Lifecycle state | Job endpoints (`request.status`) | History (`status`) |
| :-------------- | :------------------------------- | :----------------- |
| Pending         | `PENDING`                        | `P`                |
| In progress     | `IN_PROGRESS`                    | `I`                |
| Enriching       | `ENRICHING`                      | `E`                |
| Re-enriching    | `REENRICHING`                    | `R`                |
| Complete        | `COMPLETE`                       | `C`                |
| Failed          | `FAILED`                         | `F`                |

`COMPLETE` and `FAILED` are the only terminal states. Poll until you see one of them
rather than matching the intermediate values, which can change as processing evolves.
List Historical SOVs only returns records that have already reached `C` or `F`, and
[Get/Check SOV Output Result](/ping-extraction/parse-sovs/getcheck-sov-output-result)
reports only `PENDING`, `COMPLETE`, and `FAILED`.
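
If your code reads both encodings, one option is to normalize the single-letter history codes to the full words before comparing. The sketch below mirrors the table above. The names `HISTORY_CODE_TO_STATUS`, `normalize_status`, and `is_terminal` are hypothetical helpers, not part of the Ping API.

```python
# Mapping copied from the status table: history code -> job endpoint word.
HISTORY_CODE_TO_STATUS = {
    "P": "PENDING",
    "I": "IN_PROGRESS",
    "E": "ENRICHING",
    "R": "REENRICHING",
    "C": "COMPLETE",
    "F": "FAILED",
}

TERMINAL_STATUSES = frozenset({"COMPLETE", "FAILED"})


def normalize_status(value: str) -> str:
    """Return the full-word status for a history code or a full word.

    Unknown values pass through unchanged, because intermediate states
    can change as processing evolves.
    """
    return HISTORY_CODE_TO_STATUS.get(value, value)


def is_terminal(value: str) -> bool:
    """True only for COMPLETE/FAILED, in either encoding."""
    return normalize_status(value) in TERMINAL_STATUSES
```

Passing unknown values through, instead of raising, keeps a poller working if a new intermediate state appears: anything that is not terminal is treated as still running.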

`result.status` is a separate axis. `request.status` tells you whether the job
finished, and `result.status` tells you how it ended. Parse jobs report `SUCCESS`,
`FAILED_TO_READ`, `PARTIAL_PARSE`, `FAILED_TO_PARSE`, or `FAILED_TO_PROCESS`. Update
jobs report `SUCCESS`, `FAILED_TO_PARSE`, or `FAILED_TO_PROCESS`. On failure,
`result.message` describes what went wrong.
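
A hedged sketch of reading the two axes together: check `request.status` first to learn whether the job has finished, then `result.status` to learn how it ended. `describe_outcome` is a hypothetical helper, and the nested-dict response shape is an assumption based on the dotted field names above.

```python
from typing import Any, Dict, Optional


def describe_outcome(response: Dict[str, Any]) -> Optional[str]:
    """Summarize a finished job, or return None while it is still running."""
    request_status = response.get("request", {}).get("status")
    if request_status not in ("COMPLETE", "FAILED"):
        return None  # not terminal yet; keep polling

    result = response.get("result", {})
    result_status = result.get("status")
    if result_status == "SUCCESS":
        return "SUCCESS"

    # For any other outcome, surface result.message when it is present.
    message = result.get("message")
    return f"{result_status}: {message}" if message else str(result_status)
```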

## How SOV updates (SUDs) work

A SUD (SOV Update Data) is a revision of an existing SOV. Create one when you have
corrected building attributes, refreshed enrichments, or upstream data changes to
apply, and you want regenerated outputs without re-uploading the original workbook.
The [Update an SOV](/workflows/ping-extraction/update-sov) workflow walks through the
call sequence: initiate the update against the parent `sovid`, upload one or more
CSVs of revised attributes, then start the job.

When you use a SUD, expect the following behavior:

* **A SUD gets its own ID.** A SUD ID is the parent SOV ID plus `-r` and the
  zero-padded revision number. Revision 1 of `s-pl-ping-21nyms3` is
  `s-pl-ping-21nyms3-r001`, and revision 10 is `s-pl-ping-21nyms3-r010`. The
  suffix always matches the record's `revision` value.
* **The parent SOV is preserved.** An update creates a new revision rather than
  rewriting the original. The `revision` counter starts at `0` for the original parse
  and increments with each SUD, and you can still request outputs for any earlier
  revision. See [Regenerating outputs](#regenerating-outputs).
* **SUDs process like parse jobs.** The update runs asynchronously. Poll
  [Check SOV Update Status](/ping-extraction/update-sovs/check-sov-update-status)
  until `request.status` is `COMPLETE` or `FAILED`, then download the regenerated
  files from `result.outputs`.
* **SUDs appear in account history.**
  [List Historical SOVs](/ping-extraction/get-sov-data/list-historical-sovs)
  returns SOVs and SUDs in one chronological stream. A `record_type` of `ORIG` marks
  an original SOV, and any other value (for example `SCRUB` or `API`) marks a SUD.
* **Lineage is traceable.**
  [Get Historical SOV](/ping-extraction/get-sov-data/get-historical-sov)
  returns `original_sovid` and `previous_sovid`, which trace a SUD back through its
  revision chain to the original SOV.
* **A SUD ID works in place of an SOV ID when fetching outputs.** For example,
  [Get Or Create SOV Output](/ping-extraction/get-sov-data/get-or-create-sov-output)
  accepts either. When the path ID is a SUD ID, the `revision` parameter is ignored
  because the SUD already identifies one specific revision.
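
The ID and record-type rules in the list above can be illustrated with a few helpers. Treat this as a sketch under stated assumptions: the function names are hypothetical, the three-digit padding is inferred from the `r001` and `r010` examples, and the SOV ID itself is opaque, so splitting an ID is shown only for illustration. When you need lineage, `original_sovid` and `previous_sovid` from Get Historical SOV are the documented route.

```python
import re
from typing import Any, Dict, Optional, Tuple

# Assumption: the suffix is "-r" plus at least three digits, as in the examples.
_SUD_SUFFIX = re.compile(r"^(?P<parent>.+)-r(?P<revision>\d{3,})$")


def make_sud_id(parent_sovid: str, revision: int) -> str:
    """Build the SUD ID for a given revision of a parent SOV."""
    if revision < 1:
        raise ValueError("SUD revisions start at 1; revision 0 is the original parse")
    return f"{parent_sovid}-r{revision:03d}"


def split_sud_id(identifier: str) -> Tuple[str, Optional[int]]:
    """Return (parent_id, revision), or (identifier, None) if there is no SUD suffix."""
    match = _SUD_SUFFIX.match(identifier)
    if match is None:
        return identifier, None
    return match.group("parent"), int(match.group("revision"))


def is_sud_record(record: Dict[str, Any]) -> bool:
    """In history records, ORIG marks an original SOV; any other record_type is a SUD."""
    return record.get("record_type") != "ORIG"
```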

## Regenerating outputs

A reoutput generates a new output file from an SOV that is already parsed. The source
file is not re-uploaded and the SOV ID does not change. There are two ways to get one,
and they answer different needs.

**You need a format you didn't request at parse time, or a fresh copy of an existing
file.** Call
[Get Or Create SOV Output](/ping-extraction/get-sov-data/get-or-create-sov-output).
If a matching output is already cached, the response returns it immediately with a
`request.status` of `COMPLETE`. Otherwise generation starts and you poll
[Get/Check SOV Output Result](/ping-extraction/parse-sovs/getcheck-sov-output-result)
with the returned request ID. Set `overwrite_existing` to `true` to discard the cached
file and force regeneration. Set `revision` to select which revision to read: `0` for
the original parse, a positive integer for that revision number, or `-1` (the default)
for the latest. The
[Get or Create an SOV Output](/workflows/ping-extraction/get-or-create-output)
workflow walks through this end to end.
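
As a small illustration of the two options described above, the helper below validates a `revision` value and assembles it with `overwrite_existing`. `output_options` and the two constants are hypothetical names. How the options are sent (query string or request body) and how the output format is specified are not covered here, so check the endpoint reference before relying on this shape.

```python
from typing import Dict, Union

LATEST_REVISION = -1    # the documented default: read the latest revision
ORIGINAL_REVISION = 0   # the original parse


def output_options(
    revision: int = LATEST_REVISION,
    overwrite_existing: bool = False,
) -> Dict[str, Union[int, bool]]:
    """Validate and assemble the revision and overwrite_existing options."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(revision, bool) or not isinstance(revision, int):
        raise ValueError("revision must be an integer")
    if revision < LATEST_REVISION:
        raise ValueError("revision must be -1 (latest), 0 (original), or a positive integer")
    return {"revision": revision, "overwrite_existing": overwrite_existing}
```

For example, `output_options(revision=ORIGINAL_REVISION, overwrite_existing=True)` describes a forced regeneration from the original parse.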

**You want Ping to reprocess the SOV with its current processing logic.**
Initiate an update job with an `update_type` of `SCRUB`, as described in
[Update an SOV](/workflows/ping-extraction/update-sov), then start it. Uploading a
locations CSV is optional, so a `SCRUB` job can skip the add-locations step entirely.
To refresh third-party data, pass `integrations` when starting the job. When
`integrations` is omitted, your organization's workflow configuration determines
whether default enrichments run. The result is a new SUD, so you get a new revision
with regenerated outputs rather than a replacement file on the existing revision.

The difference in one line: Get Or Create SOV Output renders existing parsed data into
a file, while a `SCRUB` update reprocesses the data itself and creates a new revision.
