What are ML artifacts?
Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time-series corpus, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.
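A minimal sketch of the patching idea described above, in a toy PyTorch setup: the series is cut into fixed-size patches, each patch becomes one token for a causally-masked transformer, and each position predicts the next patch. The class name, patch size, and model dimensions are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class PatchedDecoderForecaster(nn.Module):
    """Toy patched-decoder forecaster: split the series into fixed-size
    patches, embed each patch as a token, run a causally-masked
    transformer, and predict the next patch from each position."""
    def __init__(self, patch_len=32, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)      # patch -> token
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, patch_len)       # token -> next patch

    def forward(self, series):                          # series: (batch, time)
        b, t = series.shape
        patches = series[:, : t - t % self.patch_len]   # drop the ragged tail
        patches = patches.reshape(b, -1, self.patch_len)
        x = self.embed(patches)
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        x = self.decoder(x, mask=mask)                  # causal attention
        return self.head(x)                             # next-patch prediction per position

model = PatchedDecoderForecaster()
history = torch.randn(8, 256)        # 8 synthetic series, 256 steps each
next_patches = model(history)        # shape (8, 8, 32)
```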
To better control for risk, we construct a novel machine-learning-based value factor and find that it outperforms existing value factors while earning less from risk and more from mispricings.
This work thus provides strong empirical evidence towards developing scaling laws for reinforcement learning.
We document return predictability from deep-learning models that cannot be explained by common risk factors or limits to arbitrage.
Statistical arbitrage portfolios with graph clustering algorithms
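One common recipe behind this idea, sketched below on synthetic data: build a graph whose nodes are assets and whose edges are strong return correlations, then let a community-detection algorithm carve out clusters to trade within. The correlation threshold, the greedy-modularity algorithm, and the fabricated returns are all my assumptions, not necessarily what the linked work uses.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Synthetic returns with 4 hidden "sectors" of 5 assets each.
rng = np.random.default_rng(0)
days, per_group = 250, 5
factors = rng.standard_normal((days, 4))            # latent sector factors
returns = np.hstack([
    factors[:, [g]] + 0.5 * rng.standard_normal((days, per_group))
    for g in range(4)
])                                                  # (250 days, 20 assets)

# Graph of assets, keeping only strongly co-moving pairs as edges.
corr = np.corrcoef(returns.T)
G = nx.Graph()
n = corr.shape[0]
G.add_nodes_from(range(n))
for i in range(n):
    for j in range(i + 1, n):
        if corr[i, j] > 0.5:
            G.add_edge(i, j, weight=corr[i, j])

# Community detection recovers the hidden groups.
clusters = greedy_modularity_communities(G, weight="weight")
for k, members in enumerate(clusters):
    print(f"cluster {k}: assets {sorted(members)}")
# A stat-arb pipeline would then fit spreads within each cluster
# and trade deviations from the cluster mean.
```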
Our online approach requires less memory because data is processed continuously rather than stored. Moreover, our network learns from each data sample only once, which substantially reduces energy use and makes training highly efficient.
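A minimal sketch of that single-pass online regime, using scikit-learn's `partial_fit` as a stand-in for the paper's network: each sample triggers exactly one update and is then discarded, so memory stays constant however long the stream runs. The linear model and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Single-pass online learning: one gradient step per incoming sample,
# and the sample is never stored or revisited.
model = SGDRegressor(learning_rate="constant", eta0=0.01)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

for t in range(10_000):                        # simulated data stream
    x = rng.standard_normal(3)
    y = x @ true_w + 0.1 * rng.standard_normal()
    model.partial_fit(x.reshape(1, -1), [y])   # single update, constant memory

print(model.coef_)                             # approaches true_w after one pass
```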
"Unlike in CV and NLP, the field of time series lacks publicly accessible large-scale datasets."
The complaint lays out step by step why the plaintiffs believe the datasets have illicit origins: in a Meta paper detailing LLaMA, the company points to sources for its training datasets, one of which is The Pile, assembled by the research group EleutherAI. The Pile, the complaint points out, was described in an EleutherAI paper as being put together from “a copy of the contents of the Bibliotik private tracker.” Bibliotik and the other “shadow libraries” listed, says the lawsuit, are “flagrantly illegal.”
With a new Fill-in-the-Middle paradigm, GitHub engineers improved the way GitHub Copilot contextualizes your code. By continuing to develop and test advanced retrieval algorithms, they're working on making the AI tool even more capable.
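For intuition, a hypothetical sketch of what a fill-in-the-middle (FIM) prompt looks like. The sentinel tokens below follow the `<fim_*>` convention popularized by open code models; GitHub has not published Copilot's actual format, so treat the specifics as assumptions.

```python
# Hypothetical FIM prompt construction: the model conditions on code
# both before and after the cursor, then generates the missing middle.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prefix = "def mean(xs):\n    total = "
suffix = "\n    return total / len(xs)\n"
prompt = build_fim_prompt(prefix, suffix)
# A FIM-trained model's completion after <fim_middle> is the missing span,
# e.g. "sum(xs)", informed by the code *after* the insertion point.
print(prompt)
```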
Source: Latent Space Podcast Ep. 2: Why you are holding your GPUs wrong. OpenAI just rollicked the AI world yet again yesterday: while releasing the long-awaited ChatGPT API, they also priced it at $2 per million tokens generated, which is 90% cheaper than the text-davinci-003 pricing of the “GPT3.5” family. Their blogpost on how they did it is vague: Through a series…
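The 90% figure checks out against the published launch prices: gpt-3.5-turbo at $0.002 per 1K tokens versus text-davinci-003 at $0.02 per 1K.

```python
# Quick check of the quoted discount.
turbo = 0.002 * 1000      # $ per million tokens -> $2
davinci = 0.02 * 1000     # $ per million tokens -> $20
print(f"${turbo:.0f}/M vs ${davinci:.0f}/M -> {1 - turbo / davinci:.0%} cheaper")
# -> $2/M vs $20/M -> 90% cheaper
```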
A "Copilot for X" guide from the team that built the first real Copilot competitor!
The remarkable zero-shot learning capabilities demonstrated by large foundation models (LFMs) like ChatGPT and GPT-4 have sparked a question: Can these models autonomously supervise their own behavior or other models with minimal human intervention? To explore this, a team of Microsoft researchers introduces Orca, a 13-billion-parameter model that learns complex explanation traces and step-by-step thought processes from GPT-4. This approach significantly improves the performance of existing state-of-the-art instruction-tuned models, addressing challenges related to task diversity, query complexity, and data scaling. The researchers note that query-response pairs from GPT-4 can provide valuable guidance for student models.