J2-3047 - Context-aware on-device approximate computing

Mobile computing has delivered an unprecedented rate of innovation over the last decade, redefining our approach to communication, information seeking, and decision making. With our growing reliance on mobile devices, it becomes increasingly important for mobile computing to continue to proliferate.

Processing hardware improvements and denser hardware packing have been staple methods for satisfying our computing appetite in the last decade. Yet, it has recently become obvious that neither Moore's law, stating that the number of transistors on a microchip doubles every two years, nor Dennard scaling, a law stating that as transistors get smaller the power use stays proportional to the area, holds any more. Consequently, further processing hardware packing would require both more space and more power (i.e. larger batteries), something which small form-factor portable mobile devices cannot afford. Yet, as mobile computing affirms its central role in our everyday life, we entrust an ever increasing number of complex computational tasks to it. Computer vision, speech recognition, physical activity tracking, and location-based recommendations are just some of the tasks that we expect our mobile devices to handle. These novel applications are based on computationally demanding calculations, often rely on deep neural networks, and already strain the computational capabilities of mobile devices to their limits. As our appetite for computation grows, it becomes clear that mobile innovation is critically threatened by hardware limitations.

With the proposed project we aim to ensure the future proliferation of mobile computing, and in particular mobile deep learning, by pioneering Context-aware On-device Approximate computing (CODA). Rather than advancing the hardware, we introduce a resource-preserving, controllable reduction of computing accuracy, and thus of the amount of computation needed to complete a task. CODA is based on the insight that, depending on the context, a result need not be perfectly accurate in order to fulfil a user's needs. We observe that context-varying needs exist over a range of applications: from a mobile app for elderly care that needs to be more accurate in detecting activities (e.g. falls) when a user is alone than when a user is with her caretaker, to a personal assistant app that may produce a slightly shuffled order of recommendations when a user asks for nearby restaurants, but should produce maximum quality recommendations when a user asks a medical treatment-related question.

To realise CODA we will:

- Enable variable-accuracy execution on mobile devices and profile the trade-off space between result accuracy and resource utilisation.

- Model users' receptivity to sub-accurate results in various contexts.

- Control and adapt approximate execution in real time to minimise resource usage while satisfying the user's accuracy requirements in the given context.
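
The interplay of the three steps above can be illustrated with a minimal Python sketch. It is not the project's implementation: the configuration names, the profiled accuracy and energy figures, and the per-context accuracy requirements are all hypothetical placeholders chosen to mirror the elderly-care example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApproxConfig:
    name: str
    accuracy: float   # profiled accuracy in [0, 1] (hypothetical value)
    energy_mj: float  # profiled energy per inference in mJ (hypothetical value)


# Hypothetical output of the profiling step.
PROFILE = [
    ApproxConfig("full", 0.95, 120.0),
    ApproxConfig("quant8", 0.92, 70.0),
    ApproxConfig("pruned50", 0.88, 45.0),
    ApproxConfig("pruned75", 0.80, 30.0),
]

# Hypothetical output of the user-modelling step.
REQUIRED_ACCURACY = {"alone": 0.93, "with_caretaker": 0.85}


def select_config(context, profile=PROFILE, requirements=REQUIRED_ACCURACY):
    """Return the cheapest configuration that meets the context's requirement."""
    # Unknown contexts fall back to the most accurate configuration.
    required = requirements.get(context, max(c.accuracy for c in profile))
    feasible = [c for c in profile if c.accuracy >= required]
    if not feasible:
        return max(profile, key=lambda c: c.accuracy)
    return min(feasible, key=lambda c: c.energy_mj)
```

With these placeholder numbers, the "alone" context keeps the full model, while the "with_caretaker" context can drop to the cheaper "pruned50" configuration.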

CODA will significantly reduce the resources needed for completing a task on a mobile device, thus providing the basis for future advanced applications. More fundamentally, it will bridge the gap between users' perception-based expectations and the result of the computation, and expand the dimensionality of the mobile computing optimisation space.

The project will deliver a full-fledged pipeline for creating context-aware approximate mobile computing applications, and will include: a Keras interface allowing a developer to define approximable parts of the code, a profiler calculating the most appropriate approximations for a required result accuracy level, an experience sampling and context sensing library for learning a user's context-dependent needs, a control system for real-time approximation adjustment according to varying needs, and a client-server distribution system for preparing and managing CODA apps. Finally, we will develop a backcountry skiing safety app showcasing context-aware approximation adaptation and resource savings enabled by CODA.
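
One way to picture the profiler's task is as a search for the Pareto front of the accuracy versus resource-cost space: a configuration is worth keeping only if no other configuration is both cheaper and at least as accurate. The sketch below assumes each candidate has already been measured; the function name and the tuple layout are illustrative assumptions, not part of the CODA pipeline.

```python
def pareto_front(points):
    """Return the non-dominated configurations, ordered by increasing cost.

    points: iterable of (name, accuracy, cost) tuples, where higher accuracy
    and lower cost are better.
    """
    front = []
    best_accuracy = float("-inf")
    # Scan from cheapest to most expensive; on equal cost, try the more
    # accurate configuration first.
    for name, accuracy, cost in sorted(points, key=lambda p: (p[2], -p[1])):
        # Keep a point only if it beats every cheaper point on accuracy.
        if accuracy > best_accuracy:
            front.append((name, accuracy, cost))
            best_accuracy = accuracy
    return front
```

Given a required accuracy level, the most appropriate approximation is then the first entry of the front whose accuracy meets that level.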

Research activity

Engineering sciences and technologies

Range per year

1,56 FTE

Research organisations

Institut Jožef Stefan

Researchers

Veljko Pejović, Octavian-Mihai Machidon, Alina Luminita Machidon, Davor Sluga, Ivan Majhen, Bojan Blažica (IJS)

Project phases and their realization

The project is composed of four work packages (WPs):

1. Approximation embedding and profiling, where approximate computing techniques will be implemented on mobile devices. In addition, a tuner will be devised to efficiently identify the most promising configurations of these approximation techniques.

2. Context-aware accuracy expectation modelling, where a user's context-dependent needs with respect to computation accuracy will be assessed. For this purpose, we will implement the experience sampling method on mobile devices.

3. Real-time monitoring and execution control, where the accuracy of the calculated result will be tracked and approximate configurations will be dynamically loaded as needed.

4. Integration and case studies, where we will compile the developed elements of the CODA framework into a full-fledged mobile application. The application will be delivered to real-world users, and the benefits of approximate mobile computing will be thoroughly evaluated.
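
As a rough illustration of how WP2 and WP3 could fit together, the sketch below first derives a per-context accuracy requirement from experience sampling responses, and then runs a simple controller that switches between approximation levels based on a running accuracy proxy. Everything here is an assumption made for illustration: the response format, the satisfaction threshold, the use of a confidence score as an accuracy proxy, and the windowed switching rule are not taken from the project.

```python
from collections import defaultdict, deque


def required_accuracy(responses, min_satisfaction=0.8):
    """Lowest accuracy level per context with enough satisfied responses.

    responses: iterable of (context, accuracy_level, satisfied) tuples,
    e.g. collected through experience sampling prompts.
    """
    counts = defaultdict(lambda: [0, 0])  # (context, level) -> [satisfied, total]
    for context, level, satisfied in responses:
        entry = counts[(context, level)]
        entry[0] += int(satisfied)
        entry[1] += 1
    result = {}
    for (context, level), (ok, total) in counts.items():
        if ok / total >= min_satisfaction:
            result[context] = min(level, result.get(context, level))
    return result


class ApproxController:
    """Steps between approximation levels using a windowed accuracy proxy."""

    def __init__(self, levels, target, margin=0.05, window=5):
        self.levels = levels            # ordered from cheapest to most accurate
        self.target = target            # required value of the accuracy proxy
        self.margin = margin            # slack needed before stepping down
        self.recent = deque(maxlen=window)
        self.index = len(levels) - 1    # start with the most accurate level

    @property
    def current(self):
        return self.levels[self.index]

    def update(self, confidence):
        """Record one proxy observation and return the level to use next."""
        self.recent.append(confidence)
        if len(self.recent) < self.recent.maxlen:
            return self.current         # not enough evidence yet
        mean = sum(self.recent) / len(self.recent)
        if mean < self.target and self.index < len(self.levels) - 1:
            self.index += 1             # too inaccurate: load a costlier level
            self.recent.clear()
        elif mean > self.target + self.margin and self.index > 0:
            self.index -= 1             # comfortably accurate: save resources
            self.recent.clear()
        return self.current
```

In use, the target passed to the controller would come from the requirement estimated for the current context, so a change of context simply means constructing or retargeting the controller with a different value.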

Project bibliographic references

T. Knez, O. Machidon, and V. Pejovic

Self-Adaptive Approximate Mobile Deep Learning

Electronics (2021)

O. Machidon, D. Sluga, and V. Pejovic

Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity

EuroMLSys workshop with EuroSys 2021

O. Machidon, T. Fajfar, and V. Pejovic

Watching the Watchers: Resource-Efficient Mobile Video Decoding through Context-Aware Resolution Adaptation

EAI MobiQuitous 2020

V. Pejovic

Towards Approximate Mobile Computing

ACM GetMobile Magazine, Vol 22(5), December, 2018.

Financed by

Slovenian Research Agency

Collaborators