Finding hidden objects in an augmented reality (AR) experience is fun, especially when the items have interactive qualities which trigger new content, clues and narrative. Furthermore, given the quality of the 3D assets and of AR lighting, shadow and reflection prediction and generation, coupled with the Noirscape black & white filter, it’s a great little feature to be able to capture an augmented view within the home and share it with others on social media or messaging.
The functionality allows the participant to take an AR photo snap of their newly found fictional objects within the real-world space of their home. The image can then be shared on social media or sent to a recipient using the device’s sharing API. This provides content for the user to share, a memory of the experience and marketing value for the app brand.
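The sharing half is straightforward. Here is a minimal sketch, assuming the share_plus and path_provider packages (neither is dictated by anything above, they are just the usual choices) and assuming the captured AR frame has already been obtained as PNG bytes:

```dart
import 'dart:io';
import 'dart:typed_data';

import 'package:path_provider/path_provider.dart';
import 'package:share_plus/share_plus.dart';

/// Writes the captured AR frame to a temporary file and opens the
/// platform share sheet (social media, messaging, etc.).
Future<void> shareArSnapshot(Uint8List pngBytes) async {
  final dir = await getTemporaryDirectory();
  final file = File('${dir.path}/noirscape_snapshot.png');
  await file.writeAsBytes(pngBytes);

  // share_plus hands the file over to the operating system's share sheet.
  await Share.shareXFiles(
    [XFile(file.path)],
    text: 'Found in Noirscape',
  );
}
```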
In the above image, the dots represent the AR Plane object (the horizontal plane detected in the real environment). Using the Flutter ArCore package, there was no way to hide these once the AR object had been placed, and they spoil the photograph somewhat. Fortunately I was able to fork the main package, add the functionality and send a pull request to the package maintainer. So, I am now able to hide these dots just prior to taking the capture.
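In use, the capture sequence looks roughly like this; the method names below are illustrative stand-ins for what the fork adds rather than the published arcore_flutter_plugin API:

```dart
import 'dart:typed_data';

import 'package:arcore_flutter_plugin/arcore_flutter_plugin.dart';

// Illustrative only: togglePlaneRenderer and snapshot stand in for the
// functionality added in the fork, not the published plugin API.
Future<Uint8List> captureWithoutPlaneDots(ArCoreController controller) async {
  await controller.togglePlaneRenderer(); // hide the plane dots
  final Uint8List pngBytes = await controller.snapshot(); // grab the AR view
  await controller.togglePlaneRenderer(); // restore the dots
  return pngBytes;
}
```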
Augmented Reality (AR) provides a way to place realistic-looking virtual objects into a real-world scene. While the object may merely exist upon the screen of a phone, there are features of AR which combine the two worlds, fiction and reality, beyond the two-dimensional surface of a smartphone. For example, when a 3D object is placed within the AR scene, it may not be physically present upon the targeted surface, but certain qualities of that surface are projected into the machine-generated final composition (the horizontality, width, height and distance of the plane), and ultimately this data is processed within the mind of the person who momentarily accepts the presence of a fictional item within their immediate real-world environment.
There are a few ways to look at this. One is the simple matter of tricking the mind. I myself had a fall (no long-term damage to me or the phone) while walking about testing the placement of fictional virtual items in my home. My perception of the real-world space around me was tricked by the magic window of the smartphone through which I had been focusing for several minutes while placing a 1940s vintage telephone on various surfaces. I got confused and tripped over. Maybe it was just me being clumsy, but there is, it seems to me, something to be said about the nature of fiction and how we, as humans, easily incorporate representations of real items into our mental processing of reality. This isn’t even new, or a result of the emergence of high tech. Whenever we look at a photograph, we are looking at paper and markings, yet we see a person, or a real thing. Everyone recognises Magritte’s famous painting on this subject, The Treachery of Images.
In Magritte’s case, the representative object in question is more evidently two-dimensional and contrived. Although the mind interprets the markings as being a placeholder for a real pipe, that is all it is: a placeholder, in the same way that the word P I P E is a placeholder for the physical and usable real thing. In Magritte’s painting, the image is a matter of linguistic semantics. The children’s book style in which the image is constructed intends to poke fun at the way we learn to identify things from images in the same way we do from words; in this case the pipe image is little more than a word, a modern hieroglyphic.
Around the same time as the famous painting, Magritte published a fascinating article in the surrealist journal La Révolution surréaliste entitled ‘Les mots et les images’ (‘Words and Images’), in which he shares a number of observations, platitudes maybe, surrounding the nature of words, images and their role in our interpretation of reality.
The above illustrations are about depictions of reality and the way in which our minds relate to the concept of things, particularly within the scheme of language and words, but also just the nature of things. For example, one image remarks how an object leads one to believe that there are other objects behind it. Another page notes how the visible contours of objects, in reality, touch one another in a mosaic-like manner.
So, what impact does Augmented Reality have in respect of these kinds of platitudes, the expectation that objects hide other objects and so forth? AR techniques allow devices and software to imitate reality and then embed the imitation within reality, capturing the direction of light within the scene to cast convincing-looking shadows and reflections. In the case of Noirscape, a participant discovers a fictional telephone and can place it in their real-world environment rather convincingly. The app also features a rotary dialler associated with the fictional telephone. Within the scope of the app, the dialler is used to call fictional characters. But if it were connected to the device’s real calling capabilities, in other words if the participant could call someone in the real world by interacting with the representative dial of the AR telephone, then it’s hard for me to make a distinction between using a physical phone and the augmented reality one. I think, maybe, in this case it is no longer a question of a fictional telephone but rather a virtual telephone. Magritte depicts a pipe, and I cannot smoke a mere depiction; whereas I could hook up my fictional telephone to the real world and make a call.
The challenge is to keep the app’s media files small so that they can be served on demand from a content delivery network (CDN). This ensures only content that is relevant to the current user is downloaded, and the main app size is kept small. Instead of using an MP4 or other type of movie file to convey the town 360 scene, I am using Google’s WebP format, which I have previously validated within the 360 Flutter component. My aim is to keep this file as small as possible so it can be quickly served from a remote server. Other content, such as voice and 3D characters, which are part of the experience but not specific to a town, may be compiled as assets with the build, as they will not need to update as often as other interactions. There is also the option to stream the non-town-specific video over the underlying 360 animated WebP image file.
Using an Adobe Media Encoder plugin I was able to export a short section of my 360 movie into a WebM file, the movie counterpart of the WebP format. However, this format turned out not to be supported by the 360 component I am using. So, I am looking to convert to WebP, as I will not need embedded audio, which can be played from a separate file.
I found that Google provides decent documentation for WebP as well as a number of command-line programs to help with conversion.
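For reference, the kind of invocations involved look like this; the file names and quality settings are illustrative rather than the exact ones I used. cwebp converts a single still, while img2webp assembles an animated WebP from a sequence of frames:

```bash
# Single still image to WebP.
cwebp -q 75 frame_0001.png -o frame_0001.webp

# Frame sequence to animated WebP: -d is the per-frame duration in ms,
# -loop 0 requests infinite looping (which is also the default).
img2webp -loop 0 -d 33 -lossy -q 70 frames/*.png -o town_rain.webp
```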
The conversion worked nicely. I am now able to open up my town scene with movement, in this case rainfall. However (there is always a however…), while the 360 plugin supports the standard Flutter Image widget, which in turn supports animated WebP images, I am so far unable to get the animation to loop, so it stops after the final frame.
As the 360 visual content of the spatial narrative is town based, the relevant media file is served from the CDN. It’s important that the image files are as optimized as possible so they transfer quickly. Sound effects will be embedded within the app as they are common to all users, whereas narrative speech, like the town-based visual content, depends on the user’s language and also on the path they take through the adventure based on their decision making. So, it also makes sense to serve this content from the CDN. Speech files do not need to load instantly, as they are usually triggered through an interaction. So, the final recipe involves playing embedded sound effects and an embedded visual effect while the main 360 media loads. The media is then cached, so the slight delay is only noticeable on the first playback. The narrative sound is then played on top of the 360 visual content at the appropriate moments.
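A rough sketch of that recipe on the Flutter side; the widget, asset name and URL are illustrative, and it assumes the 360 component accepts a standard image widget as its child. A bundled placeholder effect shows immediately, the town WebP streams from the CDN, and Flutter’s in-memory image cache means later plays start without the delay:

```dart
import 'package:flutter/material.dart';

/// Shows a bundled "effect" image immediately while the town-specific
/// 360 WebP streams from the CDN; once fetched it sits in Flutter's
/// image cache, so the delay is only noticeable on first playback.
class TownScene extends StatelessWidget {
  const TownScene({super.key, required this.townId});

  final String townId;

  @override
  Widget build(BuildContext context) {
    // Illustrative CDN URL; the real path is per-town and per-language.
    final url = 'https://cdn.example.com/noirscape/$townId/360_rain.webp';

    return Image.network(
      url,
      fit: BoxFit.cover,
      loadingBuilder: (context, child, progress) {
        if (progress == null) return child; // 360 media ready
        // Embedded visual effect (bundled asset) while the CDN fetch runs.
        return Image.asset('assets/effects/film_grain.webp', fit: BoxFit.cover);
      },
    );
  }
}
```

The embedded sound effects, and later the narration, would be triggered alongside this through an audio plugin such as just_audio or audioplayers; either would do, and nothing above commits me to one or the other.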
For the sound I have opted for an AI-generated voice, albeit one with intonation and a Hollywood accent.
Noirscape experiments with a cross-reality approach to narrative. Participants search for and find fictional objects in their own physical home through the AR (augmented reality) feature. One of these objects is a door, and its keyhole is a doorway between an augmented view of one world and the entirely fictional world of another. Furthermore, Noirscape binds the narrative to physical spaces in the nearby town. In my case, and for the purpose of the pilot version of Noirscape, this is the French town of Bourg-en-Bresse.
I previously carried out fieldwork collecting 360 panoramic photographic content in and around the town, at over twenty locations selected not necessarily for their prominence but for their intrigue, whether that be a curious street name, a bygone brewery turned warehouse, or the site of a house where a celebrated artist and philosopher once lived. The Noirscape experience will take participants into their town, where segments of narrative are revealed through what I call flashbacks. These are a staple component of film noir, where the protagonist, who is usually since deceased or condemned to a life in prison, recounts events from the past which played an important role in their current whereabouts (or absence thereof).
My challenge is to take my 360 content from the town and combine it with fictional noir narrative to give an augmented or combined immersive experience, whereby the content is triggered only by visiting the place in the physical world, at which point a flashback from a fictional past occurs. To achieve this I decided to work with digital 3D character creation and animation. I had previously arranged to work with a friend who is also an actor, but it’s complicated right now, with the pandemic, to meet up and spend enough quality time to get something filmed; I had been planning to use my green screens and then take the content into the 360 editor using Adobe After Effects and Premiere Pro. One thing led to another and I opted for digital characters. I initially hoped I’d be able to use Adobe software, but they have discontinued their Fuse product, a character designer app that could be used with Mixamo, their recently acquired character animation service. I decided to use Reallusion’s Character Creator instead, due to the vast amount of resources available. I used Headshot, their AI face generator, to base the character on my own face (although I’ve reworked it somewhat since!), and I imported custom objects like a fedora hat and set up the character in a black coat.
Next I took the character into iClone, Reallusion’s 3D animation suite. The challenge with iClone was to be able to bring in my 360 photo and create my own scene within the panorama. However, I ran into problems with this at first. While export to 360 panorama format is supported in iClone, I couldn’t achieve this using photography without experiencing problems with the way the image was being wrapped, due to distortion at the poles of the sphere of the Skybox object. The Skybox object in iClone, and more generally in 3D design, is the imagery used to define the farthest visible details; this would normally be the sky, hence the name, but may also be a distant mountain range. Usually this would only be thought of as a backdrop, with far more focus on the foreground and midground detail. In my case the Skybox would be represented by a complete 360 photo, in which I would place 3D assets like a person, a vehicle, and so on.
I discussed the issue in the Reallusion support forum, and one solution put forward was to create my own 3D sphere object and set my 360 image as the texture. This did produce a slightly better outcome, but not satisfactory enough for what I need. Reallusion is fantastic nonetheless; what I am seeking to do is certainly not a typical use case by any means. One really good feature of iClone, and one of the key reasons for setting a photo as the Skybox, is calculating light within a scene. The iClone software will identify from the image, in my case the 360 photo, which direction light is coming from, and therefore where to cast light and shade within the 3D assets added to the scene. So, although I chose not to use iClone with the 360 photo visible, I still used it for the lighting work.
Within iClone I applied some subtle animation to my character; his tie blows in the wind and he blinks and moves a little while he waits for his rendezvous. I applied rain effects with splashes and flickering light effects. In order to export my animation without the Skybox image, so that I could bring it into Adobe After Effects, I needed to export it as an image sequence to ensure a transparent background. The sequence is 30 seconds long at 30 frames per second, so the software rendered 900 images in total, which I then imported into After Effects.
Within After Effects the first challenge was to align the two-dimensional representation of my sequence within a 360 environment. If I place it as-is, it will be forcibly bent into a banana shape as it is interpreted through a 360 viewer. So, to avoid this, it’s important to alter the curvature of the 2D assets to align with the 360 image in equirectangular panoramic format.
I’m generally pleased with the outcome, and although it took quite a bit of time to get what I wanted, I now have a documented workflow for the process; I have a character ready to deploy to new scenarios and the know-how to create others much more quickly. A small issue I have with the end result is that the animation is too subtle to really see properly on a mobile device, but this is easily tweaked. For now, I’m going to settle with what I have for the purpose of integrating with the app. The next step is to create a looping image-based version of the scene in WebP format, as I have shown in a previous post. I will then play the audio channel, with the voice narration and sound effects, via the app/device rather than the media file itself. This will keep the size of the media file down and allow me to serve the localised element (the view using footage from a specific town) separately from the global content, the spoken narrative.
The main issue I’m experiencing currently is the inability to set the scale of an object. I’ve now forked the main repository and I’m going to look deeper into it and make any changes I may need.
The problem I have is that when a 3D object is rendered it has a colossal scale; I assumed this was constrained by some kind of Vector3 limit on the camera. For example, I am rendering a vintage telephone that I’d like to place on a table in the camera view. However, it appears the size of a gigantic spaceship.
There is a scale parameter for the Node (which represents the 3D model of the telephone in AR-speak), but this doesn’t appear to have any effect. The parameter takes a Vector3 object (x, y, z), which is a measurement relative to the parent object. However, given that the Node’s parent object is not something I have access to, I can’t set this. Either way, I’ve tried setting the scale to tiny values but it makes no difference. I’ve also tried wrapping the Node in other nodes, but this hasn’t helped either.
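For context, the placement call looks something like this; a sketch from memory of the arcore_flutter_plugin API, with an illustrative URL and values:

```dart
import 'package:arcore_flutter_plugin/arcore_flutter_plugin.dart';
import 'package:vector_math/vector_math_64.dart' as vector;

/// Places the telephone model on a tapped plane. The scale parameter
/// takes a Vector3, but in my tests changing it has no visible effect.
void placeTelephone(ArCoreController controller, ArCoreHitTestResult hit) {
  final node = ArCoreReferenceNode(
    name: 'telephone',
    objectUrl: 'https://example.com/models/telephone.gltf', // illustrative URL
    position: hit.pose.translation,
    rotation: hit.pose.rotation,
    scale: vector.Vector3(0.05, 0.05, 0.05), // seemingly ignored
  );
  controller.addArCoreNodeWithAnchor(node);
}
```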
I have checked out the underlying ARCore Java library and understand that the scale ought to be relative to the estimated size of the detected Plane (the horizontal plane of my desktop, for example). This size is taken from the estimated real-world coordinates and should be accurate to at least a metre. The attributes are getExtentX and getExtentZ. From these values it should be possible to scale the Node relatively. I’m going to check out the Java source code and see if I can spot anything.
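To make that idea concrete, the sort of calculation I have in mind would look like this; it is purely illustrative, since neither the plane extents nor the model’s bounding box are currently exposed to my Flutter code:

```dart
import 'package:vector_math/vector_math_64.dart';

/// Illustrative only: derive a uniform scale so that a model whose
/// native bounding-box width is [modelWidthMetres] occupies roughly
/// [targetWidthMetres] on the detected plane, clamped to the plane's
/// estimated extent so it never overflows it.
Vector3 scaleForPlane({
  required double modelWidthMetres,   // from the GLTF bounding box
  required double targetWidthMetres,  // e.g. 0.25 for a desk telephone
  required double planeExtentX,       // Plane.getExtentX() on the Java side
}) {
  final usableWidth = targetWidthMetres.clamp(0.0, planeExtentX);
  final factor = usableWidth / modelWidthMetres;
  return Vector3.all(factor);
}
```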
Reformatting the Object File
I couldn’t find anything wrong in the code at first glance. The object scale should be relative to the plane upon which it’s placed. So, I turned to my object files again. I noticed that while the earlier tests using the KhronosGroup sample models were big (an oversized yellow duck!), they were not spaceship-sized. So, my attention turned to the GLTF encoding of my own objects. I went through the specification again and cross-checked the Duck file with my telephone-turned-spaceship one. It’s not easy to see anything amiss this way, as it’s all about transformations and rotations: numbers which are all relative to one another. But I did have a thought about the origins of these 3D objects. I got them from Sketchfab, where you can download them directly in GLTF format. Great! Maybe not. I noticed that even Windows 3D Viewer couldn’t open my telephone. I went back to Sketchfab and downloaded the telephone again, but this time in USDZ format, a format created by Pixar that’s becoming more and more associated with AR design. It’s a single file with the textures and so on incorporated. I imported this into Adobe Dimension, and the first thing I noticed was a spaceship-sized telephone. I panned out of the ‘scene’ to see the telephone at its more earthly scale. My hypothesis is that Sketchfab auto-converts the source objects into GLTF as scenes rather than just objects, which could explain the scale issues. I hope this is the case, anyway. I’ll export the telephone from Dimension in GLTF format and test it in AR again.
Once exported, I moved the files into the web project from which I’m serving these objects.
And deploy to Firebase Hosting with the standard CLI command:
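```bash
# Deploy only the hosting content of the Firebase project.
firebase deploy --only hosting
```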
The result was certainly a step in the right direction. It’s no longer the size of the USS Enterprise, but it seems to be fixed to the size of the detected plane, which I suspect is estimated at one square metre, and it’s just floating about in the air like a drone. I shall work on the scaling further and try to understand why it’s not anchoring to the plane correctly.
This is a good place to copy a few reference points from Google’s Java API docs for ARCore, as they are written succinctly and help to keep in mind the different concepts of AR development.
“Describes the current best knowledge of a real-world planar surface.”
“Represents an immutable rigid transformation from one coordinate space to another. As provided from all ARCore APIs, Poses always describe the transformation from object’s local coordinate space to the world coordinate space.”
“As ARCore’s understanding of the environment changes, it adjusts its model of the world to keep things consistent. When this happens, the numerical location (coordinates) of the camera and Anchors can change significantly to maintain appropriate relative positions of the physical locations they represent. These changes mean that every frame should be considered to be in a completely unique world coordinate space.”
I was eventually able to scale my 3D object correctly using a combination of GLTF settings and ARCore config.
With some shader work within Flutter I’ve created a Noiresque look in which the vintage 1940s 3D telephone I got from Sketchfab (see link in the video description) is positioned consistently in the AR or ‘Mixed Reality’ world, based on the detected horizontal Plane, the Pose of the object and, of course, the World Coordinate Space.
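The actual treatment is more involved, but the core of a black and white pass in Flutter can be as simple as a colour-matrix filter wrapped around the view. Whether a widget-level filter reaches the underlying camera platform view depends on how the AR view is composited, so treat this as a sketch of the idea rather than the exact implementation:

```dart
import 'package:flutter/material.dart';

/// Wraps any child (here, the AR view) in a luminance-based greyscale
/// filter - a simplified stand-in for the full Noirscape treatment.
class NoirFilter extends StatelessWidget {
  const NoirFilter({super.key, required this.child});

  final Widget child;

  static const List<double> _greyscale = <double>[
    0.2126, 0.7152, 0.0722, 0, 0, // red output = luminance
    0.2126, 0.7152, 0.0722, 0, 0, // green output = luminance
    0.2126, 0.7152, 0.0722, 0, 0, // blue output = luminance
    0,      0,      0,      1, 0, // alpha passes through
  ];

  @override
  Widget build(BuildContext context) {
    return ColorFiltered(
      colorFilter: const ColorFilter.matrix(_greyscale),
      child: child,
    );
  }
}
```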