Hybrid Video Coding at High Bit-Rates
In this project we explore the prevalent hybrid video-coding concept that joins transform-coding and motion-compensation. Specifically, we study the necessity of transforming the motion-compensation prediction residuals for their coding at high bit-rates.
Our research relies on empirical statistics from a simplified motion-compensation procedure implemented in Matlab, and also from the reference software of the state-of-the-art HEVC standard. Our results show that the correlation among the motion-compensation residuals gets lower as the bit-rate increases, supporting the marginal use of transform coding at high bit-rates (i.e., the residuals are directly quantized).
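The correlation measurement mentioned above can be illustrated with a minimal sketch. The function name `lag1_correlation` and the toy blocks are hypothetical and are not taken from the project's Matlab code or from the HEVC reference software; the sketch only estimates the correlation coefficient between horizontally adjacent samples of a residual block.

```python
import math

def lag1_correlation(block):
    """Pearson correlation between horizontally adjacent samples of a 2D block.

    Hypothetical helper for illustration; `block` is a list of equal-length rows.
    """
    left = [row[i] for row in block for i in range(len(row) - 1)]
    right = [row[i + 1] for row in block for i in range(len(row) - 1)]
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    var_l = sum((a - mean_l) ** 2 for a in left)
    var_r = sum((b - mean_r) ** 2 for b in right)
    if var_l == 0 or var_r == 0:
        return 0.0  # a constant block carries no correlation information
    return cov / math.sqrt(var_l * var_r)

# Toy data (made up): a smooth ramp and an alternating pattern.
smooth = [[1, 2, 3, 4], [2, 3, 4, 5]]
alternating = [[1, -1, 1, -1], [-1, 1, -1, 1]]
print(lag1_correlation(smooth), lag1_correlation(alternating))
```

A coefficient close to zero on real residuals would suggest that a decorrelating transform has little left to gain, which is the situation in which direct quantization of the residuals becomes attractive.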
We also developed a research tool that provides data from intermediate stages of the HEVC codec. The data mainly include motion-compensation residuals, motion vector data, block and frame types, and the bit-budget of components, and are formatted in a structure suited to easy use in Matlab in future research projects.
Perlin City - Procedural 3D City Generation Project
Yaron Honen and Boaz Sternfeld
Based on a known method of creating a single random city block using Perlin noise, we created a method for procedurally generating an infinite city without the intervention of a human designer.
Perlin City is deterministic, realistic and beautiful.
To achieve the best possible performance, we used object pooling, spread the construction of building objects over multiple frames using coroutines, detected when objects should be created and destroyed, and added detailed objects that are displayed or hidden based on distance.
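The deterministic property can be sketched as follows. This is an assumption-laden illustration, not the project's Unity code: it replaces Perlin noise with a hashed per-block seed, and the names `block_rng` and `building_heights` are hypothetical. The point is only that a block's content depends on nothing but the world seed and the block coordinates, so a block can be destroyed and later regenerated identically.

```python
import hashlib
import random

def block_rng(world_seed, bx, bz):
    """Random generator that depends only on the world seed and block coordinates."""
    key = f"{world_seed}:{bx}:{bz}".encode()
    digest = hashlib.sha256(key).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))

def building_heights(world_seed, bx, bz, lots=4):
    """Hypothetical per-block content: one height (in floors) per building lot."""
    rng = block_rng(world_seed, bx, bz)
    return [rng.randint(2, 40) for _ in range(lots)]

# Regenerating the same block always gives the same buildings.
print(building_heights(7, 3, -5) == building_heights(7, 3, -5))
```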
This is a virtual reality simulation of a vehicle with a camera that uses visual looming cue to navigate and avoid obstacles.
We visualized the looming in different ways and compared two methods for calculating it:
one uses the ranges between the camera and the obstacles, and the other uses the temporal change of the texture density of the obstacle.
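The range-based calculation can be sketched as below, assuming the common definition of looming as the negative relative rate of change of range, L = -(dr/dt) / r. The function name and the finite-difference approximation are illustrative choices, not the project's implementation.

```python
def looming_from_ranges(r_prev, r_curr, dt):
    """Approximate looming L = -(dr/dt) / r from two consecutive range samples.

    Hypothetical helper: r_prev and r_curr are ranges to the same obstacle,
    dt is the time between the two samples.
    """
    if r_curr <= 0 or dt <= 0:
        raise ValueError("range and time step must be positive")
    range_rate = (r_curr - r_prev) / dt  # negative when approaching
    return -range_rate / r_curr

# Approaching obstacle: positive looming. Receding obstacle: negative looming.
print(looming_from_ranges(10.0, 9.0, 0.5), looming_from_ranges(9.0, 10.0, 0.5))
```

Under this definition a larger positive value means a more imminent obstacle, so a navigation rule could steer away from the direction with the highest looming.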
Rate-distortion optimized tree structures for image compression
Moshiko Elisof, Sefi Fuchs
Compression of 1D and 2D signals using improved tree coding, which exploits similarities between dyadic blocks by merging them into one tree leaf, allowing an adaptive tree structure.
The studied algorithms compensate for an inherent limitation of standard tree-based signal coding: its square blocks restrict the ability to reduce the representation bit-cost.
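As a rough illustration of rate-distortion optimized tree pruning, the sketch below handles a 1D signal whose length is a power of two. It is a simplification under stated assumptions: each leaf stores only the segment mean, every leaf costs a fixed, made-up number of bits (`leaf_bits`), the bits needed to describe the tree itself are ignored, and the merging of similar neighboring blocks studied in the project is not included.

```python
def prune(signal, start, end, lam, leaf_bits=8.0):
    """Return (cost, leaves) of the best dyadic tree for signal[start:end].

    Cost is the Lagrangian D + lam * R, with D the squared error of
    approximating each leaf by its mean and R = leaf_bits per leaf
    (an assumed, fixed rate used only for illustration).
    """
    seg = signal[start:end]
    mean = sum(seg) / len(seg)
    leaf_cost = sum((x - mean) ** 2 for x in seg) + lam * leaf_bits
    if end - start == 1:
        return leaf_cost, [(start, end)]
    mid = (start + end) // 2
    left_cost, left_leaves = prune(signal, start, mid, lam, leaf_bits)
    right_cost, right_leaves = prune(signal, mid, end, lam, leaf_bits)
    split_cost = left_cost + right_cost
    if leaf_cost <= split_cost:
        return leaf_cost, [(start, end)]  # keeping one leaf is cheaper
    return split_cost, left_leaves + right_leaves

cost, leaves = prune([1, 1, 1, 1, 9, 9, 9, 9], 0, 8, lam=1.0)
print(cost, leaves)
```

Raising `lam` favors fewer, larger leaves (lower rate, higher distortion), while lowering it favors a finer partition.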
Rate-Distortion Optimized Tree-Structured Compression Algorithms for Piecewise Polynomial Images by Rahul Shukla, Pier Luigi Dragotti, Minh N. Do, and Martin Vetterli
Image compression via improved quadtree decomposition algorithms by E. Shusterman and M. Feder
CutOutRL - Visualizing Neural Networks with Scribbles
Yonatan Zarecki, Ziv Izhar
Deep neural networks (DNNs) have been very successful in recent years, achieving state-of-the-art results in a wide range of domains, such as voice recognition, image segmentation, face recognition and more. In addition, reinforcement-learning (RL) training methods combined with DNN models (deep RL) have been able to solve a wide variety of games, from PONG to Mario, purely by looking at the pixel values of the screen.
Various “games” have been proposed for challenging neural networks, testing their capacity to learn complex tasks. Some tasks are designed to give us human insight about the way the model operates.
In this project, we challenged a deep RL model with the task of segmenting an image using scribbles. We force it to achieve good segmentations by using scribble-based segmentation in a way similar to humans. We hope to gain insight on the way the network does segmentation by looking at the scribbles it generates.
This is an open-source project intended to help people create autonomous drone missions that operate with a Pixhawk controller.
The project is written in C++ and Python to enable fast image processing and real-time operation of the drone.
The project also includes built-in missions. Our goal was to fly to a specific GPS location, scan the area for a bullseye target and land on the center of the target.
You can use the framework to create your own missions. The framework includes an API that helps to stream live video over Wi-Fi and/or record the video to a file.
We created this project in the Geometric Image Processing lab at the Technion. Our goal was to create a simple framework for managing simple and complex missions represented by a state machine.
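The idea of representing a mission as a state machine can be sketched as follows. All names here (`Mission`, the state names, the context keys) are hypothetical and do not reflect the framework's actual API; the example only mirrors the built-in mission described above: fly to a GPS location, scan for a bullseye, then land.

```python
class Mission:
    """A mission as a table of states; each handler returns the next state name.

    Hypothetical illustration, not the framework's real classes.
    """

    def __init__(self, handlers, start, done="DONE"):
        self.handlers = handlers
        self.state = start
        self.done = done
        self.history = [start]

    def step(self, context):
        self.state = self.handlers[self.state](context)
        self.history.append(self.state)

    def run(self, context, max_steps=100):
        steps = 0
        while self.state != self.done and steps < max_steps:
            self.step(context)
            steps += 1
        return self.history

def fly_to_gps(ctx):
    # Stay in this state until the (assumed) navigation flag is set.
    return "SCAN" if ctx["at_target_location"] else "FLY_TO_GPS"

def scan(ctx):
    return "LAND" if ctx["bullseye_found"] else "SCAN"

def land(ctx):
    return "DONE"

mission = Mission({"FLY_TO_GPS": fly_to_gps, "SCAN": scan, "LAND": land},
                  start="FLY_TO_GPS")
history = mission.run({"at_target_location": True, "bullseye_found": True})
print(history)
```

Keeping each state as a small function makes it straightforward to compose more complex missions by adding entries to the handler table.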
Read more on our GitHub page, or see the project report and final presentation.
Omri Azenkot and Boaz Sterenfeld and Yaron Honen
VR Newsroom is an experiment in browsing online news in virtual reality. It explores ways that VR can be used to facilitate discovery and exploration of large amounts of content. Online news was chosen for the content of the experiment because of its dynamic nature. Live APIs were chosen so that every time VR Newsroom is loaded different content is displayed. Custom RSS/ATOM feeds are also supported.
Virtual Reality interior designer
Frenkel Eduard and Salevich Alexander
Mata Sela and Yaron Honen
Our project's goal is to allow a user to experience a sense of presence in a room while designing it.
We intend to build a virtual reality design tool that lets the user design a room while being present inside a virtual version of it.
The tool will allow the user to design a room interior and experience the result at the same time.
Our solution will lead to environments better suited to the customer, because the designer can show the customer the final design and get feedback.
DeepFlowers - Online Flower Recognition using Deep Neural Networks
Yonatan Zarecki and Ziv Izhar
These days, it seems like everyone has a smartphone and an
internet connection is available everywhere, even in the most distant
corners of nature reserves. Flower recognition is a challenging task
for nature lovers: even with big, heavy flower guide books it is hard
to identify each flower species exactly, and for amateurs finding
anything in these guides can be a monumental task by itself.
Differences between flower species can be very subtle, and not easy to
detect even for an expert's eye. Another challenge a flower classifier
has to face is the sheer number of flower species in the world, or in
a specific country.
In this project we try to harness the power of deep convolutional
neural networks (CNNs), which have proven successful in similar tasks,
for our recognition task. Using data given to us by Prof. Avi Shmida of
the Hebrew University, we build a flower recognizer with an open online
API for all to use.
Tom Palny, Shani Levi and Nurit Devir
Yaron Honen and Boaz Sternfeld and Omri Azencot and Hagai Tzafrir
Our system consists of three main parts:
The first part receives a 3-dimensional matrix that represents the MRI scan of the brain. We used Matlab to create 2-dimensional images from the given matrix. Each image was saved as a PNG file and represents a specific slice of the brain.
The second part loads the images into Unity and creates a 3-dimensional object from them. To build this object, we used the ray marching algorithm.
The last part presents the object in virtual reality using the HTC Vive, and provides features that give the user a feel for the 3-dimensional object.
Our project supports the following features:
• Rotating – rotate the brain using the handheld controller.
• Cutting – Cut the brain along the three X, Y and Z axes.
• Zoom – zoom in and zoom out.
• Reset button – turn back to the initial model.
• Masking – emphasize different parts of the brain according to the user's choice. The user picks one of two presentation modes, either coloring the wanted parts or removing the unwanted parts so that only the wanted ones remain, and then chooses the specific part of interest.
In recent years, the introduction of deep reinforcement learning has allowed rapid progress in the pursuit of general AI. One of the long-standing challenges holding back further progress is designing an agent that operates in a hierarchical manner with temporal abstractions over its actions. We present a system that decomposes the learning into multiple sub-skills without external assistance. The system consists of a deep recurrent network that learns to generate action sequences from raw pixels alone, and implicitly learns structure over those sequences. We test the model on a complex 3D first-person shooter game environment to demonstrate its effectiveness.
Ksenia Kaganer, Dima Trushin and Adi Mesika
Boaz Sternfeld and Yaron Honen
We developed a 3D Anatomic learning application.
Our application assists you in the learning process by creating a realistic virtual reality environment.
You can explore all parts of the human body at a very detailed level.
Navigate between different body layers (e.g., skin, muscles, bones and internal organs) and see the anatomical name of each body part.
In addition, you can walk around the body naturally, look at it from any angle, and hold a VR plane that slices the body to reveal different anatomical cuts.
Ksenia Kaganer, Eran Tzabar
Yaron Honen and Avi Parush and Maayan Efrat
It is widely recognized that in celiac disease, learning and adhering to
a gluten-free diet is essential for ensuring a good quality of life.
It is important that the education process adopt strategies to motivate
and make the learning effective, particularly for children and
In this context, new technologies can help make the learning process more engaging.
The main idea is to use a game approach in order to make the learning and training process more engaging and intuitive.
Puppify - Automatic Generation of Planar Marionettes from Frontal Images
Elad Richardson and Gil Ben-Shachar
Anastasia Dubrovina and Aaron Weltzer
In this project, we propose a method for fully automating the body segmentation process, thus enabling a wide variety of consumer and security applications and removing the friction caused by manual input. The process starts with a deep convolutional network, used to localize body joints, which are refined and stabilized using Reverse Ensembling and skin tone cues. The skeletal pose model is then exploited to create "auto-scribbles": automatically generated foreground/background scribble masks that can be used as inputs for a wide range of segmentation algorithms to directly extract the subject's body from the background. Simple segmentation aware cropping produces individual body part crops which can be used to generate a planar marionette for repositioning and animation.
Sapir Eltanani and Simona Gluzman
A 3D game that revives the experience
of playing the well-known old game 'Snake', but this time in a
three-dimensional world in VR. The game is developed for the Leap Motion and
Oculus Rift devices.
The aim remains the same: the player has to
eat as much of the flying food as possible to earn points. Each item eaten makes
the snake longer, but once the player touches the objects in the world
around them or the snake's body, they lose.
Image Segmentation Using Multi-Region Active Contours
Chen Shapira and Tamir Segev
Our project focused on the Multi-region Active Contours with a single Level-Set function
method. This method allows quick and accurate image segmentation of 2D and 3D
images. This is done by dividing the image into multiple regions by calculating a single
nonnegative distance function, which is easily extracted using the Voronoi Implicit Interface
Augmented Reality in Road Navigation
In this work, we propose a vision-based solution for globally localizing
vehicles on the road using a single on-board camera and exploiting the
availability of previously geo-tagged street view images from the
surrounding environment together with their associated local point
clouds. Our approach focuses on integrating image-based
localization into a tracking and mapping system in order to provide
accurate and globally registered 6DoF tracking of the vehicle’s position
at all times.
Noam Yogev, Roee Mazor, Efi Shtain, Vasily Vitchevsky, Sergey Buh, Alex Bogachenko and Daniel Joseph
We built a platform for creating drones capable of autonomous indoor
flight, and implemented a computer-vision system that uses and
demonstrates this platform by performing four tasks, as described by a
national contest sponsored by the Pearls of Wisdom voluntary
association. Currently, such platforms are researched from commercial
and academic points of view, but no finished products have been
released.
The computer vision part of the project is responsible for
communicating with the quadcopter’s navigation module, in order to guide
the quadcopter to the next target, identify key objects and trigger the
execution of various required auxiliary actions.
The software will run on an ARM-based companion computer running
Linux, mounted on the drone. It will receive images from one or multiple
high speed cameras on the quadcopter and ROS messages from the
navigation module running on the same computer. The OpenCV library will
be used to implement the required functionalities.
The need for real-time 3D reconstruction is becoming more and more
apparent in today's world. Depth sensors are being marketed today in
consumer laptops and tablets. In the near future we expect an increase
in the availability of mobile devices with depth sensors, and therefore also
a need for highly efficient real-time 3D reconstruction methods. Our
project's goal is to enable these devices to perform 3D reconstruction
in real time. Our solution uses the input from a moving depth sensor to
estimate the camera position and build a 3D model. The implementation
harnesses the GPU to achieve real-time performance while taking into
account the limitations of mobile devices and putting a strong emphasis
on optimizations throughout the pipeline.
Automatic 3D Face Printing
Hila Ben-Moshe and David Gelbendorf
We developed an automatic process for building a 3D face model from a
GIP facial video file. We first use the Viola-Jones face detection
algorithm to decide which frame to choose from the video.
Detected facial features and the movement of the face are taken into
consideration when choosing the best frame automatically. Then we
process the selected frame, which includes automatically choosing a bounding box
and fixing missing eyebrows. Finally, we write the resulting model in
STL format, a common format for 3D printing.
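The final export step can be illustrated with a minimal ASCII STL writer. This is a sketch under assumptions: the function names are hypothetical, the mesh is given as a list of triangles with three (x, y, z) vertices each, and the project's own exporter may well use binary STL or a library instead.

```python
def facet_normal(v0, v1, v2):
    """Unit normal of a triangle from the cross product of two of its edges."""
    ax, ay, az = (v1[i] - v0[i] for i in range(3))
    bx, by, bz = (v2[i] - v0[i] for i in range(3))
    nx, ny, nz = ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0:
        return (0.0, 0.0, 0.0)  # degenerate triangle
    return (nx / length, ny / length, nz / length)

def to_ascii_stl(triangles, name="face"):
    """Serialize triangles to ASCII STL text (hypothetical helper)."""
    lines = [f"solid {name}"]
    for v0, v1, v2 in triangles:
        n = facet_normal(v0, v1, v2)
        lines.append(f"  facet normal {n[0]:.6e} {n[1]:.6e} {n[2]:.6e}")
        lines.append("    outer loop")
        for v in (v0, v1, v2):
            lines.append(f"      vertex {v[0]:.6e} {v[1]:.6e} {v[2]:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"

stl_text = to_ascii_stl([((0, 0, 0), (1, 0, 0), (0, 1, 0))])
print(stl_text)
```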
Image Segmentation and Matting in Realtime on a Mobile Device
In our project we implemented a scribble-based algorithm for
extracting objects from natural photos and pasting them seamlessly into a
different background under the constraints of a mobile device.
The algorithm was first developed and tested on a personal computer with
the help of OpenCV’s C++ libraries and was then ported to Android using
the Native Development Kit.
We used the Android Software Development Kit in order to wrap the
algorithm in a user friendly interface to create an application that
anybody can use.
Rigid ICP Registration with Kinect
Choukroun Yoni and Semmel Elie
The main goal of the first part of the project was to perform
Iterative Closest Point (ICP) registration on two depth maps obtained with
the Kinect depth sensor, in C++ on the Windows platform.
A further purpose of this first part was to learn how to integrate
large libraries (dynamic or not) into the project on our own, and to deal with
the difficulty of implementing an algorithm across classes from different
libraries that do not necessarily match one another.
The second part of the project was to register, pair by pair using the
preceding algorithm, different scan frames captured by the Kinect with the
help of its motor, in order to obtain a whole-body depth image.
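For readers unfamiliar with rigid ICP, the following is a compact sketch of the textbook point-to-point variant, written with NumPy rather than the C++ libraries used in the project. It assumes two small point clouds given as (N, 3) arrays and uses brute-force nearest-neighbor search; the function names are illustrative only.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that R @ src[i] + t fits dst[i]."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, iterations=20):
    """Rigid ICP with brute-force nearest neighbors (only suitable for small clouds)."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = src.copy()
    for _ in range(iterations):
        d2 = ((current[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matches = dst[d2.argmin(axis=1)]       # closest dst point for each src point
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Each iteration alternates between matching every source point to its closest target point and solving for the rigid motion that best aligns the matched pairs; a practical implementation would add a spatial index and a convergence test.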
Eng. Alon Zvirin and Eng. Yaron Honen
Building a toolbox for physicians for the preliminary analysis of a 3D
face model/surface in clinical trials. Among other things, the toolbox
allows physicians to display an optimal face model of the subject after
a video scan of the face, to find points of interest on the
model/surface, and to measure geodesic distances between the points at
the physician's discretion. The application receives a 3D video file
from the GIP camera and automatically selects the best depth image from
the video. In the depth image, three points at the center of the face
are selected automatically using Viola-Jones. Using ASM, 68 points over
the entire displayed face surface are found automatically. An algorithm
developed by the student selects, fully automatically, 12 key points
out of all the points (5 on the mouth, 4 on the eyes, 3 on the nose).
The interface we developed shows the paths over the face between the
selected points and immediately reports the geodesic distance between
them (fast marching). In addition, all the data (points, distances,
image number, notes, etc.) can be saved to an Excel file and loaded at
a later date.
Printed circuit boards detection and image analysis
Giorgio Tabarani and Roi Divon
In recent years, and following recent events in the country, a
demand has arisen to identify printed circuit boards collected by the
Israeli Police at crime scenes, in order to connect cases and
identify the source of these boards, in the hope of preventing similar
incidents in the future.
The police take snapshots of printed circuit boards from every
crime scene, mostly circuits distorted by burns or fractures, and
try to identify their origin by manual inspection and guesswork.
In this project we were asked to develop a basic system that
performs this identification automatically, given a picture
of a fragment and a database of pictures of circuit boards that commonly
appear at crime scenes.
Generating 3D Colored Face Model Using a Kinect Camera
Rotem Mordoch and Nadine Toledano and Ori Ziskind
Matan Sela and Yaron Honen
The constant development of cheap depth cameras, together with their
ongoing integration into mobile devices, offers the potential for
many new and exciting applications in a variety of fields,
including personal everyday use, commercial objectives and medical
solutions. In our project we propose a user-friendly system that allows
the user to easily create a colored 3D model of their own face.
The solution we offer combines open source
techniques for face detection in an image with a 3D reconstruction
algorithm. We integrate these techniques into a single algorithm
that achieves this goal. The system we have built uses a depth camera
stream to capture a subject’s face in each frame, and uses this
information to generate a high quality colored 3D facial model. We
demonstrate our results and our optimizations to the solution, and suggest
possible directions for continuing our work.
Solving Simultaneous Linear Equations Using GPU
Oriel Rosen and Haviv Cohen
Image processing tends to demand highly complicated computations.
Though some programming languages (such as Matlab) are very convenient for mathematical work,
they are less than ideal in terms of performance, leading to programs that take far too long to run.
The solution to this problem is to use “stronger” tools at the bottlenecks of the computation.
By programming in low-level languages and by using parallelism,
we can drastically improve our program’s performance.
Freehand Voxel Carving Scanning on a Mobile Device
3D scanners are growing in their popularity as many new
applications and products are becoming a commodity. These applications
are normally tethered to a computer and/or require expensive and
specialized hardware. Our goal is to provide a 3D scanner which uses
only a mobile phone with a camera. We consider the problem of computing
the 3D shape of an unknown, arbitrarily shaped scene from multiple color
photographs taken at known but arbitrarily distributed viewpoints using
a mobile device. The estimated camera orientation and position in 3D
space obtained from publicly available SLAM libraries permit us to
perform a 3D reconstruction of the observed objects. We demonstrate that
it is possible to achieve a good 3D reconstruction on a mobile device.
3D Image Fusion using ICP
Alon Zvirin and Guy Rosman and Yaron Honen
This project deals with the ICP
algorithm and uses it to create a complete three-dimensional model of a
rigid object. First, we wrote a wrapper for the ICP algorithm included in the PCL
library, which fuses 3D images taken by the GIP Technion laboratory camera,
and optimized its running parameters. Afterwards,
we implemented various improvements to the ICP algorithm using a
point-to-surface distance function, examined its parameters, and
wrote a fusion algorithm. Finally, the program was integrated into the
GUI of the GIP Technion laboratory.
Shahar Sagiv, Omri Panizel, Loui Diab and Waseem Ghraye
We created a new type of game, which combines the competitive aspect of
the Bingo game with the great fun of solving a puzzle. This game is
played simultaneously by 4 players (on 4 different devices)
who compete to solve the given puzzle. Every player can see the
real-time progress of the other players on a map showing their boards.
Every puzzle has a title, and the user can choose to stop solving
the puzzle upon recognizing the image and try to guess the image
title. This gives the game educational value, as the players learn to
recognize places, animals, and celebrities.
Anna Ufliand and Sergey Yusufov
The main goal of the project was to develop a system that can interface
with a medical testing device developed at the Faculty of Biomedical
Engineering, and read and display the data from this device in a
convenient and efficient way, so that the data can be accessed from
different devices. One of the most important criteria for us was to
develop a system that allows maximal flexibility. A biomedical device
can have many different configurations; it is in fact a platform that
allows different sensors to be assembled for performing different tests.
It was important to us to take this into account and to develop a system
that does not limit the type or number of sensors used to perform tests.
Artists, designers and architects use imbalance to their advantage to produce surprising and elegant designs.
The balancing process is challenging when manipulating geometry in 3D modeling software,
since volumes are represented only by their boundaries.
Our goal is to modify the shape of the volume so that, once printed, the model stands.
To do so, we manipulate the inner voids; when this is not enough, we also consider deforming the model.
These two manipulations change the mass distribution and thus the center of mass position.
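The standing criterion can be sketched with a simple static test: the model stands if its center of mass, projected along gravity, falls inside the support region. The code below is a simplified illustration under assumptions (a voxelized solid given as a list of point masses, a convex support polygon listed counter-clockwise, gravity along -z); the function names are hypothetical and this is not the project's optimization procedure.

```python
def center_of_mass(cells):
    """cells: list of (x, y, z, mass) tuples for the solid (non-void) voxels."""
    total = sum(c[3] for c in cells)
    return tuple(sum(c[i] * c[3] for c in cells) / total for i in range(3))

def inside_convex_polygon(point, polygon):
    """True if a 2D point lies inside a convex polygon given in counter-clockwise order."""
    px, py = point
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        # A negative cross product means the point is to the right of this edge.
        if (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0) < 0:
            return False
    return True

def stands(cells, support_polygon):
    """Static balance test: projected center of mass lies within the support polygon."""
    cx, cy, _ = center_of_mass(cells)
    return inside_convex_polygon((cx, cy), support_polygon)

support = [(0, 0), (1, 0), (1, 1), (0, 1)]
solid = [(0.5, 0.5, 0.0, 1.0), (2.5, 0.5, 1.0, 1.0)]    # heavy overhang
carved = [(0.5, 0.5, 0.0, 1.0), (2.5, 0.5, 1.0, 0.1)]   # overhang hollowed out
print(stands(solid, support), stands(carved, support))
```

Hollowing the overhanging part lowers its mass and pulls the center of mass back over the base, which is the effect the inner-void manipulation aims for.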
Solving Classification Problem on Hyperspectral Images
Talor Abramovich and Oz Gavrielov
Hyperspectral imaging is a spectral imaging method that includes
bands from visible light as well as infrared. Unlike 2D color
images, which use only red, green and blue, a hyperspectral image includes
a third dimension of spectrum. This information can be used to classify
the objects in the image and to distinguish between asphalt,
plants and water. It can even show the difference between a real leaf
and a plastic one. In our project we used several classification
algorithms, including KNN, PCA and KSVD, to classify four hyperspectral
images. We compared the results and found which algorithm gives the best
classification and which is the most efficient.
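Of the algorithms listed, KNN is the simplest to sketch. The example below classifies a single pixel by its spectrum; the four-band spectra and their values are made-up toy data, and the function name is hypothetical. Real hyperspectral pixels have many more bands and the project's pipeline may differ.

```python
from collections import Counter

def knn_classify(spectrum, training, k=3):
    """Label a pixel spectrum by majority vote among its k nearest training spectra.

    `training` is a list of (spectrum, label) pairs; distance is squared Euclidean.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(training, key=lambda item: dist2(spectrum, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 4-band reflectance values (invented for illustration only).
training = [
    ([0.10, 0.10, 0.10, 0.10], "asphalt"),
    ([0.12, 0.10, 0.11, 0.10], "asphalt"),
    ([0.05, 0.10, 0.05, 0.60], "plant"),
    ([0.06, 0.12, 0.05, 0.55], "plant"),
    ([0.05, 0.04, 0.03, 0.01], "water"),
    ([0.06, 0.05, 0.03, 0.02], "water"),
]
print(knn_classify([0.05, 0.11, 0.06, 0.58], training))
```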
Quaternion K-SVD for Color Image Denoising
In this work, we introduce the use of Quaternions within the field of sparse and redundant representations. The Quaternion space is an extension of the complex space, where each element is composed of four parts – a real-part and three imaginary parts. The major difference between Quaternion space H and the complex C space is that the Quaternion space has non-commutative multiplication. We design and implement Quaternion variants of state-of-the-art algorithms OMP and KSVD. We show various results, previously established only for the real or complex spaces, and use them to devise the Quaternion K-SVD algorithm, nicknamed QK-SVD.
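The non-commutativity mentioned above is easy to verify with the Hamilton product. The sketch below represents a quaternion as a (w, x, y, z) tuple; the function name `qmul` is an illustrative choice and is not part of the QK-SVD implementation.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

print(qmul(i, j))  # equals k
print(qmul(j, i))  # equals -k, so i*j != j*i
```

Because the order of factors matters, algorithms such as OMP and K-SVD cannot be carried over from the real or complex case by simply swapping the number type; each product in a derivation has to keep a fixed left or right order.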
Structured Light Based 3D Reconstruction with Priors
Itamar Talmi and Ofir Haviv
This project examines the use of the PCA algorithm for unlocking an
Android device by face recognition. The project demonstrates how, by
learning a person's face under different lighting conditions and
building a PCA basis for that person, the basis can be used to verify
someone who tries to identify themselves when unlocking an Android device.
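One common way to verify a face against a per-person PCA basis is to threshold the reconstruction error, and the sketch below assumes that approach. It is not necessarily the method used in the project: the function names, the synthetic 16-pixel "images" and the threshold are all hypothetical, and a real system would tune the threshold on enrollment data.

```python
import numpy as np

def fit_pca_basis(faces, n_components):
    """faces: (n_samples, n_pixels) array of one person's flattened images."""
    mean = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:n_components]          # rows are orthonormal principal directions

def reconstruction_error(face, mean, basis):
    """Distance between a face and its projection onto the PCA subspace."""
    centered = face - mean
    projected = basis.T @ (basis @ centered)
    return float(np.linalg.norm(centered - projected))

def verify(face, mean, basis, threshold):
    """Accept the face if the enrolled person's subspace explains it well enough."""
    return reconstruction_error(face, mean, basis) <= threshold

# Synthetic enrollment data: images that vary only along two "lighting" directions.
rng = np.random.default_rng(1)
directions = rng.normal(size=(2, 16))
base_face = rng.normal(size=16)
enrolled = base_face + rng.normal(size=(10, 2)) @ directions
mean, basis = fit_pca_basis(enrolled, n_components=2)

genuine = base_face + np.array([0.3, -0.7]) @ directions
impostor = genuine + 5.0 * rng.normal(size=16)
print(verify(genuine, mean, basis, 1e-6), verify(impostor, mean, basis, 1e-6))
```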
CamPong – Smartphone PONG using Camera and Built-in Projector
Nofar Carmeli and Rom Herskovitz
In this project, we introduce the use of a mobile phone, equipped with a
camera and a projector, to allow real time hand detection. We present a
demo of an interactive pong game that is controlled by the players’
natural hand movements during the game. To the best of our knowledge
this is the first known use of a commodity cellular phone that uses an
inbuilt projector to perform real time structured light projections
coupled with real time image processing.
Nurit Schwabsky and Vered Cohen
OpenFusion is an implementation of Microsoft’s KinectFusion system. This
system enables real-time tracking and reconstruction of a 3D scene
using a depth sensor. A stream of depth images is received from the
camera and compared to the model built so far using the Iterative
Closest Point (ICP) algorithm to track the 6DOF camera position. The
camera position is then used to integrate the new depth images into the
growing volumetric model, resulting in an accurate and robust
reconstruction. The reconstructed model is adapted according to dynamic
changes in the scene without losing accuracy.
3D Stereo Reconstruction Using iPhone Devices
Ron Slossberg and Omer Shaked
Stereo Reconstruction is a common method for obtaining depth information
about a given scene using 2D images of the scene taken simultaneously
by two cameras from different views. This process is done by finding
corresponding objects which appear in both images and examining their
relative positions in the images, based on previous knowledge of the
internal parameters of each camera and the relative positions of both
cameras. This method relies on the same basic principle that enables our
eyes to perceive depth.
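For the simplest case of two rectified, parallel cameras, the relation between relative image position and depth reduces to Z = f * B / d, where f is the focal length in pixels, B the distance between the cameras and d the disparity in pixels. The sketch below assumes that idealized setup; the function name and the numbers are illustrative only.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters of a point seen by two rectified, parallel cameras.

    Assumes an ideal pinhole model: Z = focal_px * baseline_m / disparity_px.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_px * baseline_m / disparity_px

# Halving the disparity doubles the estimated depth.
print(depth_from_disparity(20.0, 1000.0, 0.1), depth_from_disparity(10.0, 1000.0, 0.1))
```

The inverse relation explains why nearby objects shift a lot between the two views while distant ones barely move, and why depth resolution degrades with distance.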