Import your face into a 3D scene, live!
This JavaScript library:
- Gets the camera video stream,
- Detects and tracks the user's face,
- Crops the face and evaluates its depth.

Everything is done in real time, in a standard web browser. We provide a THREE.js demo where the 3D face of the user is inserted into a 3D scene. However, this library is framework-agnostic and can be used with any web 3D engine.
Here are the main features of the library:
- video acquisition,
- face detection,
- face tracking,
- face cropping,
- inference of face depth,
- global 3D face pose estimation,
- robust to difficult lighting conditions,
- mobile friendly.
- `/demos/`: demonstration source code,
- `/dist/`: core of the library:
  - `WebARRocksFaceDepth.js`: main minified script,
  - `WebARRocksFaceDepth.module.js`: main minified script for module use (with `import` or `require`),
- `/helpers/`: scripts which can help you use this library in some specific use cases,
  - `WebARRocksFaceDepthThreeHelper.js`: makes the bridge between THREE.js and this library,
- `/neuralNets/`: neural network models,
  - `NN_FACEDEPTH_DEPTH_<version>.json`: neural network computing the depth from a cropped face image,
  - `NN_FACEDEPTH_TRACK_<version>.json`: neural network detecting and tracking the face,
- `/libs/`: 3rd party libraries and 3D engines used in the demos.
Here are the demonstrations included in this repository:
- Three.js avatar: live demo, source code
- Ready Player Me avatar: live demo, source code
The best way to get started is to take a look at the demos.
- `<function> callbackReady`: This function is called when the library is initialized, with 2 arguments: `<string|false> errorCode, <object> spec`. It is called again if an error happens. The object `spec` has the following properties:
  - `GL`: the WebGL context. The 3D rendering engine should use this WebGL context,
  - `canvasElement`: the `<canvas>` element,
  - `videoTexture`: a WebGL texture displaying the camera video. It has the same resolution as the camera video,
  - `[<float>, <float>, <float>, <float>] videoTransformMat2`: flattened 2x2 matrix encoding a scaling and a rotation. Apply this matrix to viewport coordinates to render `videoTexture` in the viewport,
  - `<HTMLVideoElement> video`: the video used as source for the WebGL texture `videoTexture`.
- `<function> callbackTrack`: This function is called at each iteration of the loop, with an object as argument having these properties:
  - `<float> detected`: the face detection probability, between `0` and `1`,
  - `<boolean> isDetected`: whether the face is detected or not,
  - `<float> x`, `<float> y`: the 2D coordinates of the center of the detection frame in the viewport (each between -1 and 1, `x` from left to right and `y` from bottom to top),
  - `<float> s`: the scale along the horizontal axis of the detection frame, between 0 and 1 (1 for the full width). The detection frame is always square,
  - `<float> rx`, `<float> ry`, `<float> rz`: the Euler angles of the head rotation in radians,
  - `<Uint8Array> RGBDBuf`: RGBD buffer of the face,
  - `<int> RGBDRes`: resolution of the RGBD buffer.
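As an illustration of the coordinate conventions above, here is a minimal sketch that converts the detection frame (`x`, `y`, `s`) into pixel coordinates with a top-left origin. The function name `detectionFrameToPixels` and the returned field names are hypothetical; only the conventions (viewport coordinates between -1 and 1, `y` pointing up, `s` relative to the full width, square frame) come from the description above.

```javascript
// Hypothetical helper: converts the detection frame given to callbackTrack
// into pixel coordinates, with the origin at the top-left corner of the canvas.
function detectionFrameToPixels(detectState, canvasWidth, canvasHeight) {
  // s is relative to the full width, and the frame is always square:
  const side = detectState.s * canvasWidth;

  // x goes from -1 (left) to 1 (right):
  const centerX = (detectState.x + 1) * 0.5 * canvasWidth;

  // y goes from -1 (bottom) to 1 (top), so it is flipped for a top-left origin:
  const centerY = (1 - detectState.y) * 0.5 * canvasHeight;

  return {
    centerX: centerX,
    centerY: centerY,
    side: side,
    left: centerX - side / 2,
    top: centerY - side / 2
  };
}
```

It could be called from `callbackTrack` when `isDetected` is `true`, for example to position a 2D overlay on top of the canvas.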
- `<string> canvasId` or `<HTMLCanvasElement> canvas`: the canvas where the computation will be performed,
- `<string> NNTrackPath` or `<object> NNTrack`: the neural network model used for face detection and tracking,
- `<string> NNDepthPath` or `<object> NNDepth`: the neural network model used for depth inference,
- `<integer> animateDelay`: sets the number of milliseconds the browser waits at the end of the rendering loop before starting another detection. If you use the canvas of this library as a secondary element (for example in PACMAN or EARTH NAVIGATION demos) you should set a small `animateDelay` value (for example 2 milliseconds) in order to avoid rendering lags,
- `<function> onWebcamAsk`: function launched just before asking the user to allow camera access,
- `<function> onWebcamGet`: function launched just after the user has accepted to share their video. It is called with the video element as argument,
- `<dict> videoSettings`: overrides the MediaStream API video settings, which are by default:
{
'videoElement' // not set by default. <video> element used
// If you specify this parameter,
// all other settings will be useless
// it means that you fully handle the video aspect
'deviceId' // not set by default
'facingMode': 'user', // to use the rear camera, set to 'environment'
'idealWidth': 800, // ideal video width in pixels
'idealHeight': 600, // ideal video height in pixels
'minWidth': 480, // min video width in pixels
'maxWidth': 1280, // max video width in pixels
'minHeight': 480, // min video height in pixels
'maxHeight': 1280, // max video height in pixels,
'rotate': 0 // rotation in degrees possible values: 0,90,-90,180
},
- `<dict> scanSettings`: overrides face scan settings - see the `set_scanSettings(...)` method for more information,
- `<dict> stabilizationSettings`: overrides tracking stabilization settings - see the `set_stabilizationSettings(...)` method for more information,
- `<boolean> isKeepRunningOnWinFocusLost`: whether we should keep the detection loop running even if the user switches the browser tab or minimizes the browser window. Default value is `false`. This option is useful for a videoconferencing app, where a face mask should still be computed if the FaceFilter window is not the active window. Even with this option toggled on, the face tracking is still slowed down when the FaceFilter window is not active.
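The arguments above can be put together as in the following sketch. It assumes that `init()` takes a single object holding these properties, which is not stated explicitly here, and it receives the library object as a parameter (`lib`) rather than guessing the name of the global. The function name `startFaceDepth`, the canvas ID and the log messages are hypothetical, and the `<version>` placeholders in the neural network paths must be replaced by the actual file names found in `/neuralNets/`.

```javascript
// Hypothetical wrapper around init(). `lib` is the library object.
function startFaceDepth(lib, overrides = {}) {
  const spec = Object.assign({
    canvasId: 'faceDepthCanvas', // hypothetical <canvas> ID
    // Placeholders: replace <version> with the real file names.
    NNTrackPath: 'neuralNets/NN_FACEDEPTH_TRACK_<version>.json',
    NNDepthPath: 'neuralNets/NN_FACEDEPTH_DEPTH_<version>.json',
    animateDelay: 2, // milliseconds
    videoSettings: { facingMode: 'user', idealWidth: 800, idealHeight: 600 },

    callbackReady: function (errorCode, readySpec) {
      if (errorCode) {
        console.log('Initialization failed:', errorCode);
        return;
      }
      // readySpec.GL, readySpec.canvasElement, readySpec.videoTexture, ...
      console.log('Library is ready');
    },

    callbackTrack: function (detectState) {
      if (!detectState.isDetected) {
        return;
      }
      // detectState.x, .y, .s, .rx, .ry, .rz, .RGBDBuf, .RGBDRes
    }
  }, overrides);

  lib.init(spec);
  return spec;
}
```

Passing the library object explicitly keeps the sketch independent of how the script is loaded (script tag or module).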
The initialization function (`callbackReady` in the code snippet) will be called with an error code (`errCode`). It can have these values:
- `false`: no error occurred,
- `"GL_INCOMPATIBLE"`: WebGL is not available, or this WebGL configuration is not sufficient (there is no WebGL2, or there is WebGL1 without the OES_TEXTURE_FLOAT or OES_TEXTURE_HALF_FLOAT extension),
- `"ALREADY_INITIALIZED"`: the library has already been initialized,
- `"NO_CANVASID"`: no canvas ID was specified,
- `"INVALID_CANVASID"`: cannot find the `<canvas>` element in the DOM,
- `"INVALID_CANVASDIMENSIONS"`: the dimensions `width` and `height` of the canvas are not specified,
- `"WEBCAM_UNAVAILABLE"`: cannot get access to the camera (the user has no camera, or has not accepted to share the device, or the camera is already busy),
- `"GLCONTEXT_LOST"`: the WebGL context was lost. If the context is lost after the initialization, the `callbackReady` function will be launched a second time with this value as error code.
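The error codes above could be turned into user-facing messages along these lines. This is only a sketch: the function name `describeInitError` and the message texts are hypothetical, and only the error code strings come from the list above.

```javascript
// Hypothetical mapping from the documented error codes to readable messages.
const INIT_ERROR_MESSAGES = {
  GL_INCOMPATIBLE: 'WebGL is not available or not sufficient on this device.',
  ALREADY_INITIALIZED: 'The library has already been initialized.',
  NO_CANVASID: 'No canvas ID was specified.',
  INVALID_CANVASID: 'The <canvas> element cannot be found in the DOM.',
  INVALID_CANVASDIMENSIONS: 'The canvas width and height are not specified.',
  WEBCAM_UNAVAILABLE: 'The camera is missing, busy, or access was refused.',
  GLCONTEXT_LOST: 'The WebGL context was lost.'
};

// Returns null when there is no error, otherwise a readable message.
function describeInitError(errCode) {
  if (errCode === false) {
    return null;
  }
  if (Object.prototype.hasOwnProperty.call(INIT_ERROR_MESSAGES, errCode)) {
    return INIT_ERROR_MESSAGES[errCode];
  }
  return 'Unknown error: ' + String(errCode);
}
```

Since `callbackReady` can be called a second time with `"GLCONTEXT_LOST"`, such a helper should be safe to call more than once.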
- `resize()`: should be called after resizing the `<canvas>` element to adapt the crop of the video,
- `toggle_pause(<boolean> isPause)`: pauses/resumes,
- `set_animateDelay(<integer> delay)`: changes the `animateDelay` (see `init()` arguments),
- `set_inputTexture(<WebGLTexture> tex, <integer> width, <integer> height)`: replaces the video input with a WebGL texture instance. The dimensions of the texture, in pixels, should be provided,
- `reset_inputTexture()`: switches back to the user's video as input texture,
- `get_videoDevices(<function> callback)`: should be called before the `init` method. 2 arguments are provided to the callback function:
  - `<array> mediaDevices`: an array with all the devices found. Each device is a JavaScript object having a `deviceId` string attribute. This value can be provided to the `init` method to use a specific camera. If an error happens, this value is set to `false`,
  - `<string> errorLabel`: if an error happens, the label of the error. It can be: `NOTSUPPORTED`, `NODEVICESFOUND` or `PROMISEREJECTED`.
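A possible way to use `get_videoDevices` before `init` is sketched below. The function name `pickVideoDevice` and the choice of the last device in the list are arbitrary illustrations; the library object is passed as `lib`. The resulting object is meant to be used as the `videoSettings` init argument, assuming `deviceId` is set there as the default settings listing suggests.

```javascript
// Hypothetical helper: asks the library for the available cameras and
// builds a videoSettings object targeting one of them.
function pickVideoDevice(lib, onPicked) {
  lib.get_videoDevices(function (mediaDevices, errorLabel) {
    if (!mediaDevices) {
      // errorLabel is NOTSUPPORTED, NODEVICESFOUND or PROMISEREJECTED
      onPicked(null, errorLabel);
      return;
    }
    // Arbitrary choice for this sketch: take the last device of the list.
    const device = mediaDevices[mediaDevices.length - 1];
    onPicked({ deviceId: device.deviceId }, null);
  });
}
```

A real application would more likely show the list to the user and let them choose.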
- `set_scanSettings(<object> scanSettings)`: overrides scan settings. `scanSettings` is a dictionary with the following properties:
  - `<float> threshold`: detection threshold, between `0` and `1`. Default value is `0.75`. You can decrease it to make the detection more sensitive (but it will increase false positive detections),
  - `<int> nDetectsPerLoop`: number of detections per drawing loop. `0` for an adaptive value. Default: `0`,
  - `<int> nScaleLevels`: number of detection steps for the scale. Default: `3`,
  - `[<float>, <float>, <float>] overlapFactors`: overlaps between 2 scan positions for `X`, `Y` and `scale`. Default: `[2, 2, 3]`,
  - `<float> scale0Factor`: scale factor for the largest scan level. Default is `0.8`.
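The defaults listed above can be gathered in one object and merged with overrides before calling `set_scanSettings`. The sketch below is an illustration only: `DEFAULT_SCAN_SETTINGS` and `makeScanSettings` are hypothetical names, and the range check on `threshold` is an added safeguard, not documented library behavior.

```javascript
// Default values as documented for set_scanSettings().
const DEFAULT_SCAN_SETTINGS = {
  threshold: 0.75,
  nDetectsPerLoop: 0, // 0 means adaptive
  nScaleLevels: 3,
  overlapFactors: [2, 2, 3],
  scale0Factor: 0.8
};

// Hypothetical helper: merges overrides with the documented defaults.
function makeScanSettings(overrides = {}) {
  const settings = Object.assign({}, DEFAULT_SCAN_SETTINGS, overrides);
  // Copy the array so that callers cannot mutate the defaults:
  settings.overlapFactors = settings.overlapFactors.slice();
  if (!(settings.threshold >= 0 && settings.threshold <= 1)) {
    throw new RangeError('threshold should be between 0 and 1');
  }
  return settings;
}
```

For example, `lib.set_scanSettings(makeScanSettings({ threshold: 0.6 }))` would make the detection more sensitive, `lib` being the library object.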
- `set_stabilizationSettings(<object> stabilizationSettings)`: overrides detection stabilization settings. The output of the neural network is always noisy, so we need to stabilize it using a moving average to avoid shaking artifacts. The internal algorithm first computes a stabilization factor `k` between `0` and `1`. If `k==0.0`, the detection is bad and we favor responsiveness over stabilization. This happens when the user is moving quickly, rotating the head, or when the detection is bad. On the contrary, if `k` is close to `1`, the detection is good and the user does not move a lot, so we can stabilize a lot. `stabilizationSettings` is a dictionary with the following properties:
  - `[<float> minValue, <float> maxValue] translationFactorRange`: multiply `k` by a factor `kTranslation` depending on the translation speed of the head (relative to the viewport). `kTranslation=0` if `translationSpeed<minValue` and `kTranslation=1` if `translationSpeed>maxValue`. The interpolation is linear. Default value: `[0.0015, 0.005]`,
  - `[<float> minValue, <float> maxValue] rotationFactorRange`: analogous to `translationFactorRange` but for rotation speed. Default value: `[0.12, 0.25]`,
  - `[<float> minValue, <float> maxValue] qualityFactorRange`: analogous to `translationFactorRange` but for the head detection coefficient. Default value: `[0.85, 0.95]`,
  - `[<float> minValue, <float> maxValue] alphaRange`: specifies how to apply `k`. Between 2 successive detections, we blend the previous `detectState` values with the current detection values using a mixing factor `alpha`. `alpha=<minValue>` if `k<0.0` and `alpha=<maxValue>` if `k>1.0`. Between the 2 values, the variation is quadratic. Default value is `[0.05, 0.9]`. It only applies to global pose stabilization. Landmarks are stabilized using helpers (`/helpers/WebARRocksLMStabilizer<X>.js`).
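To make the role of the ranges more concrete, here is a small sketch of the two mappings described above, following the wording literally. It is not the library's internal code: the function names `rampFactor` and `alphaFromK` are hypothetical, and the exact quadratic law (`k` squared between the two bounds) is an assumption, since only "quadratic" is specified.

```javascript
// Linear ramp as described for translationFactorRange and the analogous ranges:
// 0 below minValue, 1 above maxValue, linear in between.
function rampFactor(value, range) {
  const minValue = range[0];
  const maxValue = range[1];
  if (value <= minValue) return 0;
  if (value >= maxValue) return 1;
  return (value - minValue) / (maxValue - minValue);
}

// Mixing factor alpha computed from k and alphaRange.
// ASSUMPTION: the quadratic variation is taken as k * k.
function alphaFromK(k, alphaRange) {
  const minValue = alphaRange[0];
  const maxValue = alphaRange[1];
  if (k <= 0) return minValue;
  if (k >= 1) return maxValue;
  return minValue + (maxValue - minValue) * k * k;
}
```

With the default `alphaRange` of `[0.05, 0.9]`, this sketch keeps `alpha` close to its lower bound for small values of `k` and only approaches `0.9` when `k` gets close to `1`.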
- `update_videoElement(<video> vid, <function|False> callback)`: replaces the video element used for the face detection (which can be provided via `videoSettings.videoElement`) with another video element. A callback function can be called when it is done,
- `update_videoSettings(<object> videoSettings)`: dynamically changes the video settings (see Optional init arguments for the properties of `videoSettings`). It is useful to switch from the selfie camera (user) to the back (environment) camera. A `Promise` is returned. If `videoSettings = null`, the video is stopped and the camera is turned off,
- `destroy()`: cleans both graphic memory and JavaScript memory, and uninitializes the library. After that you need to initialize the library again. A `Promise` is returned,
- `is_winFocus()`: returns whether the current window has focus (for example, if the user has changed the browser tab, it will return `false`). This function works only if the init option `isKeepRunningOnWinFocusLost` is set to `true`.
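Switching between the front and rear cameras with `update_videoSettings` might look like the sketch below. The function names `switchCamera` and `stopCamera` are hypothetical and the library object is passed as `lib`. Whether the provided settings are merged with the current ones or replace them is not specified here, so passing only `facingMode` is an assumption.

```javascript
// Hypothetical helper: selects the rear or the selfie camera.
// Returns the Promise given by update_videoSettings().
function switchCamera(lib, useRearCamera) {
  return lib.update_videoSettings({
    facingMode: useRearCamera ? 'environment' : 'user'
  });
}

// Hypothetical helper: stops the video and turns the camera off,
// using the documented null value.
function stopCamera(lib) {
  return lib.update_videoSettings(null);
}
```

Both helpers return the `Promise`, so the caller can wait for the change to be effective, for example with `switchCamera(lib, true).then(...)`.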
You should host the content of this repository using an HTTPS static server.
Be careful to enable gzip HTTP/HTTPS compression for JSON and JS files. Indeed, the neural network JSON file, `neuralNets/NN_<xxx>.json`, is quite heavy, but compresses very well with gzip. Check that your server actually applies gzip compression to these files.
This library relies on WebAR.rocks WebGL Deep Learning technology to detect and track the user's face using a neural network. The accuracy is adaptive: the better the hardware, the more detections are processed per second. Everything is done on the client side.
- If `WebGL2` is available, it uses `WebGL2` and no specific extension is required,
- If `WebGL2` is not available but `WebGL1` is, we require either the `OES_TEXTURE_FLOAT` extension or the `OES_TEXTURE_HALF_FLOAT` extension,
- If `WebGL2` is not available, and if `WebGL1` is not available or neither `OES_TEXTURE_FLOAT` nor `OES_TEXTURE_HALF_FLOAT` is implemented, the user's configuration is not compatible.
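The decision rules above can be approximated before initializing the library with a check such as the following. It is a rough sketch, not the library's own test, which may be stricter: `checkWebGLSupport` and its return values are hypothetical names. A fresh canvas is created for each attempt because a canvas that has already returned one context type returns `null` for the others. The extension names are written as the WebGL API expects them (`OES_texture_float`, `OES_texture_half_float`).

```javascript
// Hypothetical pre-check following the rules above.
// createCanvas is a function returning a new <canvas>, for example:
//   () => document.createElement('canvas')
function checkWebGLSupport(createCanvas) {
  // WebGL2: no specific extension is required.
  if (createCanvas().getContext('webgl2')) {
    return 'WEBGL2';
  }
  // WebGL1: float or half float textures are required.
  const gl = createCanvas().getContext('webgl');
  if (gl && (gl.getExtension('OES_texture_float') ||
             gl.getExtension('OES_texture_half_float'))) {
    return 'WEBGL1';
  }
  return 'INCOMPATIBLE';
}
```

In a browser, `checkWebGLSupport(() => document.createElement('canvas'))` returning `'INCOMPATIBLE'` suggests that `init` would report `"GL_INCOMPATIBLE"`.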
If a compatibility error is triggered, please post an issue on this repository. If this is a problem with the camera access, please first retry after closing all applications which could use your device (Skype, Messenger, other browser tabs and windows, ...). Please include:
- a screenshot of webglreport.com - WebGL1 (about your `WebGL1` implementation),
- a screenshot of webglreport.com - WebGL2 (about your `WebGL2` implementation),
- the log from the web console,
- the steps to reproduce the bug, and screenshots.
This code repository is dual licensed. You have to choose between these 2 licenses:
- GPLv3 (free default option),
- Nominative commercial license: please contact us for more information.
For more information, please read the LICENSE file.