Evaluation Layer
After videos are ingested and pre-processed, they arrive at the Evaluation Layer, where ORN transforms raw visual content into structured knowledge. If ingestion secures authenticity and pre-processing guarantees quality and privacy, evaluation is where ORN begins to interpret, annotate, and judge what is happening in each video. This stage not only extracts meaning from unstructured pixels but also determines how well a task was performed and how that performance contributes to a user’s reputation score within EgoPlay.

The process begins with object and action detection. Each video is analyzed frame by frame to identify the entities and activities that define the task. Everyday items such as dishes, clothing, or tools are recognized alongside human motions like folding, cutting, shuffling, or cleaning. Temporal segmentation links these objects and actions into coherent sequences, mapping the flow of events from start to finish. For example, in a video of a user preparing a meal, ORN identifies the ingredients, the utensils, and the cooking motions, then validates the before-and-after states of the task.
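The segmentation step described above can be sketched in miniature. This is a hypothetical illustration, not ORN's implementation: it assumes an upstream detector has already produced one action label per frame, and simply groups consecutive identical labels into coherent temporal segments.

```python
from itertools import groupby

def segment_actions(frame_labels):
    """Group per-frame action labels into temporal segments.

    frame_labels: one detected action label per video frame
    (assumed output of an upstream detector).
    Returns a list of (label, start_frame, end_frame) tuples.
    """
    segments = []
    idx = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((label, idx, idx + length - 1))
        idx += length
    return segments

# Illustrative per-frame labels for a short cooking clip
frames = ["cutting"] * 3 + ["mixing"] * 2 + ["plating"] * 2
print(segment_actions(frames))
# → [('cutting', 0, 2), ('mixing', 3, 4), ('plating', 5, 6)]
```

A production system would operate on detector confidence scores and smooth over noisy frames, but the output shape, an ordered map of events from start to finish, is the same.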
Once the task is recognized, ORN applies Skill Scoring. Each submission is compared against a reference library of expert demonstrations curated for that activity. For example, a sushi-making challenge may be benchmarked against a dataset of professional chefs performing the same task. ORN measures how closely the user’s sequence of actions, timing, precision, creativity, and outcome match the expert baseline. Was the sushi rolled cleanly? Were the steps followed in the correct order? Was the presentation consistent with professional standards? The closer a submission aligns to the expert model, the higher the score assigned. In this way, scoring is not arbitrary; it reflects an objective assessment of how well the task was performed relative to the best available demonstrations.
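One way to picture this comparison is a weighted similarity between the user's submission and the expert reference. The sketch below is purely illustrative, with made-up weights and only two of the dimensions mentioned (action order and step timing); it is not ORN's actual scoring model.

```python
from difflib import SequenceMatcher

def skill_score(user_actions, expert_actions, user_timings, expert_timings):
    """Hypothetical scoring sketch: compare a user's action sequence
    and per-step timings against an expert reference.
    Weights (0.6 / 0.4) are illustrative, not ORN's parameters."""
    # Similarity of action order and coverage, in [0, 1]
    order_sim = SequenceMatcher(None, user_actions, expert_actions).ratio()
    # Timing similarity: penalize relative deviation per step, capped at 1
    pairs = list(zip(user_timings, expert_timings))
    timing_sim = sum(1 - min(abs(u - e) / e, 1.0) for u, e in pairs) / len(pairs)
    return round(0.6 * order_sim + 0.4 * timing_sim, 3)

# Sushi-rolling example: correct step order, slightly slow on two steps
user   = ["spread_rice", "add_filling", "roll", "slice"]
expert = ["spread_rice", "add_filling", "roll", "slice"]
print(skill_score(user, expert, [30, 20, 15, 10], [25, 20, 12, 10]))
```

A submission with missing or reordered steps drops the order term, while sloppy pacing drops the timing term, so the score degrades smoothly rather than pass/fail.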
Instead of a single generic reputation, ORN builds multidimensional profiles organized by activity type (such as kitchen, sports, or outdoor tasks) and reinforced by cross-cutting attributes like dexterity, precision, endurance, and creativity. A contributor might excel in kitchen tasks with high precision, while another demonstrates strength in outdoor activities requiring endurance. ORN captures these nuances and reflects them in the user’s profile.
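The shape of such a profile can be sketched as a simple data structure. The class below is a hypothetical illustration of the idea, scores bucketed both by activity type and by cross-cutting attribute, not ORN's actual schema.

```python
from collections import defaultdict

class SkillProfile:
    """Illustrative multidimensional reputation profile: each scored
    submission contributes to an activity bucket and to one or more
    cross-cutting attribute buckets."""

    def __init__(self):
        self.by_activity = defaultdict(list)   # e.g. "kitchen" -> [scores]
        self.by_attribute = defaultdict(list)  # e.g. "precision" -> [scores]

    def record(self, activity, attributes, score):
        self.by_activity[activity].append(score)
        for attr in attributes:
            self.by_attribute[attr].append(score)

    @staticmethod
    def average(bucket, key):
        scores = bucket.get(key, [])
        return sum(scores) / len(scores) if scores else None

profile = SkillProfile()
profile.record("kitchen", ["precision", "dexterity"], 0.92)
profile.record("outdoor", ["endurance"], 0.78)
print(profile.average(profile.by_activity, "kitchen"))   # → 0.92
print(profile.average(profile.by_attribute, "endurance"))  # → 0.78
```

Because the same score feeds both views, a contributor's strengths surface per domain (kitchen vs. outdoor) and per attribute (precision vs. endurance) without maintaining separate reputations.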
By the time a video exits the Evaluation Layer, it has been transformed into more than just a labeled clip. It becomes a scored and contextualized record of human activity, benchmarked against expert references, and linked directly into a contributor’s Skill Tree. This dual role, interpreting content and evaluating ability, makes evaluation one of the most critical stages of ORN's architecture.
All thresholds, parameters, and detection methods described are subject to continuous refinement as technology advances and as the requirements of the ecosystem evolve.