Personal profile
About Me
I'm an Assistant Professor at Bar-Ilan University, leading the Multimodal Lab.
My research focuses on multimodal problems, primarily generative, with challenges including modal alignment and efficient inference-time solutions.
I also have a strong interest in connecting ideas from decision making in cognitive science to deep learning concepts, such as model perceptiveness, representation as comprehension, attention, and programming as System 2 problem solving.
I completed my postdoc with Prof. Lior Wolf at Tel Aviv University. Before that, I earned my PhD in Computer Science from the Technion, under the supervision of Prof. Tamir Hazan and Prof. Alexander G. Schwing from UIUC. My thesis focused on Cognitive Models in Deep Learning.
My experience as an industry researcher includes work on eBay's catalog (vision and language), Microsoft's Assistant (transcript-based meeting insights), and Spot's cloud workload optimization platform (time-series prediction). I currently serve as Chief Scientist at Aigency.ai, where I'm helping revolutionize the internet in the age of intelligent agents.
Research
- Fields of Interest
- Computer Vision
- Natural Language Processing
- Attention Models
- Multimodal Learning
- Multimodal models
- Large language models (LLMs)
- Generative AI
- Explainability
- Perceptual bias and plug-and-play control
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Yariv, G., Gat, I., Benaim, S., Wolf, L., Schwartz, I. & Adi, Y., 25 Mar 2024, Technical Tracks 14. Wooldridge, M., Dy, J. & Natarajan, S. (eds.). 7 ed. Association for the Advancement of Artificial Intelligence, p. 6639-6647, 9 p. (Proceedings of the AAAI Conference on Artificial Intelligence; vol. 38, no. 7). Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Open Access, 25 Scopus citations
AUDIOTOKEN: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Yariv, G., Gat, I., Wolf, L., Adi, Y. & Schwartz, I., 2023, In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2023-August, p. 5446-5450, 5 p. Research output: Contribution to journal › Conference article › peer-review
5 Scopus citations
Discriminative Class Tokens for Text-to-Image Diffusion Models
Schwartz, I., Snæbjarnarson, V., Chefer, H., Belongie, S., Wolf, L. & Benaim, S., 2023, Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023. Institute of Electrical and Electronics Engineers Inc., p. 22668-22678, 11 p. (Proceedings of the IEEE International Conference on Computer Vision). Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
9 Scopus citations
Zero-Shot Video Captioning by Evolving Pseudo-tokens
Tewel, Y., Shalev, Y., Nadler, R., Schwartz, I. & Wolf, L., 2023. Research output: Contribution to conference › Paper › peer-review
2 Scopus citations
Describing Sets of Images with Textual-PCA
Hupert, O., Schwartz, I. & Wolf, L., 2022, Findings of the Association for Computational Linguistics: EMNLP 2022. Goldberg, Y., Kozareva, Z. & Zhang, Y. (eds.). Association for Computational Linguistics (ACL), p. 4537-4544, 8 p. (Findings of the Association for Computational Linguistics: EMNLP 2022). Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Open Access