Search: [ml]

LLMs Outperform Reinforcement Learning- Meet SPRING: An Innovative Prompting Framework for LLMs Designed to Enable in-Context Chain-of-Thought Planning and Reasoning

SPRING is an LLM-based policy that outperforms reinforcement learning algorithms in an interactive environment requiring multi-task planning and reasoning. A group of researchers from Carnegie Mellon University, NVIDIA, Ariel University, and Microsoft has investigated the use of Large Language Models (LLMs) for understanding and reasoning with human knowledge in the context of games. They propose a two-stage approach called SPRING, which involves studying an academic paper and then using a Question-Answer (QA) framework to justify the knowledge obtained. More details about SPRING: In the first stage, the authors read the LaTeX source code of the original paper by Hafner (2021)

ml · paper · rl · llm

May 28, 2023 at 7:11:41 PM EDT

https://www.marktechpost.com/2023/05/28/llms-outperform-reinforcement-learning-meet-spring-an-innovative-prompting-framework-for-llms-designed-to-enable-in-context-chain-of-thought-planning-and-reasoning/?amp

How does fine tuning really work? - API - OpenAI Developer Forum

GPT models are decoder-only transformers, so fine-tuning updates the weights of the decoder stack. The decoder generates the output text token by token from the preceding context; unlike encoder-decoder transformers, GPT has no separate encoder. The decoder is made up of multiple layers of masked multi-head self-attention and feed-forward neural networks.
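As a rough illustration of the self-attention step inside a GPT-style decoder layer, the sketch below applies a causal mask and a row-wise softmax to a small matrix of attention scores. It is a toy example under stated assumptions, not code from the linked forum post: the function name `causal_attention_weights` is hypothetical, there is a single head, and the learned query/key/value projections are omitted.

```python
import math


def causal_attention_weights(scores):
    """Apply a causal mask and a row-wise softmax to a square score matrix.

    scores[i][j] is the raw attention score of query position i against key
    position j. Position i may only attend to positions j <= i.
    (Hypothetical helper for illustration; single head, no projections.)
    """
    n = len(scores)
    weights = []
    for i in range(n):
        visible = scores[i][: i + 1]   # keys at or before position i
        peak = max(visible)            # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Future positions (j > i) are masked out and get weight 0.
        weights.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return weights


# Toy input: uniform scores over three tokens.
weights = causal_attention_weights([[0.0] * 3 for _ in range(3)])
for row in weights:
    print([round(w, 3) for w in row])
```

Each row sums to 1, and every entry above the diagonal is 0, which is what keeps a decoder from looking at tokens it has not generated yet.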

ml

May 12, 2023 at 12:18:58 PM EDT

https://community.openai.com/t/how-does-fine-tuning-really-work/39972/2

Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & chat models — TOGETHER

Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned and chat models.

ml

May 6, 2023 at 4:19:12 PM EDT

https://www.together.xyz/blog/redpajama-models-v1

Stanford researchers terminate ChatGPT-like OpenAI two months after launch

Alpaca was built on Meta AI's LLaMA 7B model, with training data generated by a method known as self-instruct.

"I think one of the safest ways to move forward with this technology is to make sure that it is not in too few hands."

ml

April 16, 2023 at 2:28:47 PM EDT

https://www.geo.tv/latest/480032-stanford-researchers-terminate-chatgpt-like-openai-two-months-after-launch

Feature Visualization

How neural networks build up their understanding of images

ml · explainer

April 14, 2023 at 3:49:22 PM EDT

https://distill.pub/2017/feature-visualization/

Statement from the listed authors of Stochastic Parrots on the “AI pause” letter

The current race towards ever larger "AI experiments" is not a preordained path where our only choice is how fast to run, but rather a set of decisions driven by the profit motive. The actions and choices of corporations must be shaped by regulation which protects the rights and interests of people.

ml

April 1, 2023 at 5:26:15 PM EDT

https://www.dair-institute.org/blog/letter-statement-March2023