Date: May 01, 2024
Time: 5:00 PM - 6:00 PM
Location: Online
Abstract: We provide new estimates of an asymptotic upper bound on the entropy of English using the large language model LLaMA-7B as a predictor for the next token given a window of past tokens. This estimate is significantly smaller than currently available estimates (Cover and King, 1978; Lutati et al., 2023). A natural byproduct is an algorithm for lossless compression of English text which combines the prediction from the large language model with a lossless compression scheme. Preliminary results from limited experiments suggest that our scheme outperforms state-of-the-art text compression schemes such as BSC, ZPAQ, and paq8h.
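The core idea in the abstract can be illustrated with a small sketch: a predictor's average surprisal, the mean of -log2 p(next symbol | context), is exactly the number of bits per symbol an entropy coder driven by that predictor would need, and hence an upper bound on the entropy of the source. The sketch below is a toy, hedged stand-in: it uses a tiny adaptive order-2 character model in place of LLaMA-7B, and the function name `predictive_entropy` and the sample text are illustrative choices, not part of the talk's method.

```python
import math
from collections import defaultdict

def predictive_entropy(text, order=2):
    """Estimate bits/char as the average -log2 p(next char | context).

    A toy adaptive character model (add-one smoothing over contexts of
    `order` characters) stands in for the LLM predictor: the model is
    updated online after each prediction, so no statistics from future
    text leak into the estimate.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> char -> count
    totals = defaultdict(int)                       # context -> total count
    vocab = len(set(text))
    bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        # Smoothed predictive probability of the actual next character.
        p = (counts[ctx][ch] + 1) / (totals[ctx] + vocab)
        bits += -math.log2(p)
        counts[ctx][ch] += 1
        totals[ctx] += 1
    return bits / len(text)

sample = "the quick brown fox jumps over the lazy dog " * 50
print(f"{predictive_entropy(sample):.3f} bits/char")
```

A stronger predictor (such as an LLM) assigns higher probability to the true next token, lowering this average; feeding the same probabilities into an arithmetic coder yields a lossless compressor whose output length matches the estimate to within a small constant, which is the sense in which the entropy estimate and the compression scheme are two views of the same quantity.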
Bio: Kaushik Valmeekam is a doctoral candidate in Electrical Engineering at Texas A&M University, working under the guidance of Professor Krishna Narayanan. His research lies at the intersection of Machine Learning and Wireless Communication. His recent work spans a variety of topics, including text compression using Large Language Models (LLMs), in-context estimation using Transformers in wireless communications, and over-the-air neural group testing. His current interest is in studying Large Language Models from a compression perspective, aiming to enhance their efficiency and effectiveness.