New best story on Hacker News: Ask HN: Why did Python win?

Ask HN: Why did Python win?
346 by MatthiasPortzel | 599 comments on Hacker News.
I started programming in ~2013 in JavaScript. I’ve since learned and tried a handful of languages, including Python, but JavaScript was always my favorite. Just within the last year I learned Ruby, and I was blown away by how fun and easy to use it is. At the present time, I’m starting all my new projects in Ruby. My impression is that in the ‘00s, Python and Ruby were both relatively new, dynamically typed, “English-like” languages. And for a while these languages had similar popularity. Now Ruby is still very much alive; there are plenty of Rails jobs available and exciting things happening with Ruby itself. But Python has become a titan in the last ten years. It has continued to grow exponentially and Ruby has not. I can guess as to why (Python’s math libraries, numpy and pandas make it appealing to academics; Python is simpler and possibly easier to learn; Rails was so popular that it was synonymous with Ruby) but I wasn’t paying attention at that time. So I’m interested in hearing from some of the older programmers about why Ruby has stalled out and Python has become possibly the most popular programming language (when, in my opinion, Ruby is the better language).

New best story on Hacker News: 111,111.1 meters is reliably 1 degree of latitude

111,111.1 meters is reliably 1 degree of latitude
331 by mholt | 258 comments on Hacker News.


New best story on Hacker News: Slack’s migration to a cellular architecture

Slack’s migration to a cellular architecture
392 by serial_dev | 234 comments on Hacker News.


New best story on Hacker News: Show HN: Open-source obsidian.md sync server

Show HN: Open-source obsidian.md sync server
381 by acheong08 | 135 comments on Hacker News.
https://ift.tt/Prq5W0k Hello HN, I'm a recent high school graduate and can't afford $8 per month for the official sync service, so I tried my hand at replicating the server. It's still missing a few features, such as file recovery and history, but the basic sync is working. To the creators of Obsidian.md: I'm probably violating the TOS, and I'm sorry. I'll take down the repository if asked. It's not ready for production and is highly inefficient, so it's not competition, and I hope you'll be lenient.

New best story on Hacker News: E-ink is so Retropunk

E-ink is so Retropunk
470 by raisjn | 253 comments on Hacker News.


New best story on Hacker News: Web scraping for me, but not for thee

Web scraping for me, but not for thee
443 by mhb | 121 comments on Hacker News.


New best story on Hacker News: Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B

Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B
394 by rushingcreek | 136 comments on Hacker News.
Hi HN,
We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.
The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.
- CodeLlama-34B achieved 48.8% pass@1 on HumanEval
- CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval
We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used; both models underwent a native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.
Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples. The methodology is:
- For each evaluation example, we randomly sampled three substrings of 50 characters or used the entire example if it was fewer than 50 characters.
- A match was identified if any sampled substring was a substring of the processed training example.
For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report.
Presented below are the pass@1 scores we achieved with our fine-tuned models:
- Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval
- Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval
Note on GPT-4
According to the official technical report in March, OpenAI reported a pass@1 score of 67% for GPT-4's performance on HumanEval. Since then, there have been claims reporting higher scores. However, it's essential to note that there hasn't been any concrete evidence pointing towards an enhancement in the model's coding abilities since then. It's also crucial to highlight that these elevated figures lack the rigorous contamination analysis that the official statistic underwent, making them less of a reliable comparison. As a result, we consider 67% as the pass@1 score for GPT-4.
Download
We are releasing both models on Huggingface for verifiability and to bolster the open-source community. We welcome independent verification of results.
Phind-CodeLlama-34B-v1: https://ift.tt/DIWxFJn
Phind-CodeLlama-34B-Python-v1: https://ift.tt/7tqJsGl
We'd love to hear your thoughts!
Best, The Phind Team
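The decontamination check described in the post (three random 50-character substrings per evaluation example, matched against the training examples) could be sketched roughly as follows. This is an illustration, not Phind's or OpenAI's code: the function names are hypothetical, and the `normalize` step is an assumption, since the post only says the training example is "processed" without giving details.

```python
import random

def normalize(text: str) -> str:
    # ASSUMPTION: "processing" here means keeping only letters and digits.
    # The post does not specify the exact processing step.
    return "".join(ch for ch in text if ch.isalnum())

def sample_probes(example: str, rng: random.Random,
                  n: int = 3, length: int = 50) -> list[str]:
    """Sample n random substrings of `length` chars, or the whole example if shorter."""
    if len(example) < length:
        return [example]
    max_start = len(example) - length
    starts = [rng.randint(0, max_start) for _ in range(n)]
    return [example[s:s + length] for s in starts]

def is_contaminated(eval_example: str, training_examples: list[str],
                    rng: random.Random) -> bool:
    """True if any sampled probe occurs inside any processed training example."""
    probes = [p for p in sample_probes(normalize(eval_example), rng) if p]
    processed = [normalize(t) for t in training_examples]
    return any(p in t for p in probes for t in processed)
```

Because the probes are sampled at random, a partially overlapping example can be missed on a given run. The check is a cheap screen, not a proof that the dataset is clean.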

New best story on Hacker News: Hugging Face raises $235M from investors including Salesforce and Nvidia

Hugging Face raises $235M from investors including Salesforce and Nvidia
365 by immortal3 | 195 comments on Hacker News.


New best story on Hacker News: OpenTF Announces Fork of Terraform

OpenTF Announces Fork of Terraform
541 by cube2222 | 173 comments on Hacker News.


New best story on Hacker News: Factorio: Space Age

Factorio: Space Age
569 by haunter | 136 comments on Hacker News.


New best story on Hacker News: The complete sequence of a human Y chromosome

The complete sequence of a human Y chromosome
404 by birriel | 172 comments on Hacker News.


New best story on Hacker News: Hacker News Guidelines

Hacker News Guidelines
391 by tonmoy | 362 comments on Hacker News.


New best story on Hacker News: Show HN: LLMs can generate valid JSON 100% of the time

Show HN: LLMs can generate valid JSON 100% of the time
526 by remilouf | 165 comments on Hacker News.
Outlines is a Python library that focuses on text generation with large language models. Brandon and I are not LLM experts and started the project a few months ago because we wanted to understand better how the generation process works. Our original background is probabilistic, relational and symbolic programming. Recently we came up with a fast way to generate text that matches a regex ( https://ift.tt/7OPjnS8... ). The basic idea is simple: regular expressions have an equivalent Deterministic Finite Automaton (DFA) representation. We can transform this DFA into a generative model: in each state we get a list of symbols which correspond to completions that partially match the regular expression. We mask the other symbols in the logits returned by a large language model, sample a new symbol and move to the next state. The subtlety is that language models work with tokens, not symbols, so we derive a new FSM whose alphabet is the model's vocabulary. We can do this in only one pass over the vocabulary. Generating the token masks thus only requires a dictionary lookup at each state. Our method blows other libraries like Microsoft's guidance out of the water. From there it was only a small leap to be able to generate text that follows a JSON schema ( https://ift.tt/6vSd4Ui ), or is parseable into a Pydantic model ( https://ift.tt/QGvi5gE ). The method works with union types, optional types, nested schemas, arrays, everything. It is guaranteed that the output is parseable. I think it's cool, and I've spent a lot of time watching even tiny models output valid JSON over the weekend. Hope you will too. I look forward to feedback, bug reports, feature requests and discussions! Edit: Link to our pre-print explaining the method and how this can be extended to generate text that follows a Context-Free Grammar https://ift.tt/epGhtcM
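The idea in the post (walk a DFA, precompute which vocabulary tokens are allowed in each state, then mask everything else before sampling) might look roughly like the toy sketch below. It is not the Outlines implementation or API. The DFA is hand-written for the regex `[0-9]+\.[0-9]+` rather than compiled from it, `VOCAB` is a made-up vocabulary, and random numbers stand in for a model's logits. All names are hypothetical.

```python
import math
import random
import re

# Hand-written DFA for [0-9]+\.[0-9]+ (illustrative assumption).
# States: 0 = start, 1 = integer part, 2 = just after ".", 3 = fraction part.
STATES = (0, 1, 2, 3)
ACCEPTING = {3}

def step(state, ch):
    """Return the next DFA state, or None if `ch` is not allowed in `state`."""
    if ch in "0123456789":
        return {0: 1, 1: 1, 2: 3, 3: 3}[state]
    if ch == "." and state == 1:
        return 2
    return None

# Toy vocabulary: tokens can span several characters, unlike DFA symbols.
VOCAB = ["1", "23", ".", "4.5", ".7", "a", "x1", "0."]

def build_index(vocab):
    """Map each state to {token_id: next_state} for tokens the DFA can consume."""
    index = {s: {} for s in STATES}
    for token_id, token in enumerate(vocab):  # single pass over the vocabulary
        for s in STATES:
            cur = s
            for ch in token:
                cur = step(cur, ch)
                if cur is None:
                    break
            else:
                index[s][token_id] = cur
    return index

def generate(index, vocab, rng, stop_prob=0.5):
    """Sample tokens under the mask until stopping in an accepting state."""
    state, out = 0, []
    while True:
        allowed = index[state]                # dictionary lookup per step
        logits = [rng.uniform(-1, 1) for _ in vocab]  # stand-in for a model
        ids = list(allowed)
        weights = [math.exp(logits[i]) for i in ids]  # masked tokens get no weight
        token_id = rng.choices(ids, weights=weights)[0]
        out.append(vocab[token_id])
        state = allowed[token_id]
        # A real model would emit an end-of-sequence token; here we flip a coin.
        if state in ACCEPTING and rng.random() < stop_prob:
            return "".join(out)

index = build_index(VOCAB)
sample = generate(index, VOCAB, random.Random(0))
assert re.fullmatch(r"[0-9]+\.[0-9]+", sample)
```

In this sketch the per-state table is built once, so each generation step only needs `index[state]` to obtain the mask. This corresponds to the dictionary lookup the post describes.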

New best story on Hacker News: How a startup loses its spark

How a startup loses its spark
555 by imadj | 271 comments on Hacker News.


New best story on Hacker News: 80% of bosses say they regret earlier return-to-office plans

80% of bosses say they regret earlier return-to-office plans
702 by pg_1234 | 587 comments on Hacker News.


New best story on Hacker News: Judge sends Sam Bankman-Fried to jail over witness tampering

Judge sends Sam Bankman-Fried to jail over witness tampering
583 by coloneltcb | 604 comments on Hacker News.


New best story on Hacker News: Wendelstein 7-X: Gigajoule energy turnover generated for eight minutes

Wendelstein 7-X: Gigajoule energy turnover generated for eight minutes
510 by greesil | 309 comments on Hacker News.


New best story on Hacker News: Squeeze the hell out of the system you have

Squeeze the hell out of the system you have
674 by sbmsr | 365 comments on Hacker News.


New best story on Hacker News: Azure ChatGPT: Private and secure ChatGPT for internal enterprise use

Azure ChatGPT: Private and secure ChatGPT for internal enterprise use
726 by taubek | 264 comments on Hacker News.


New best story on Hacker News: Postgres Language Server

Postgres Language Server
821 by kiwicopple | 101 comments on Hacker News.
hey HN. this is a Language Server[0] designed specifically for Postgres. A language server adds features to IDEs (VSCode, NeoVim, etc) - features like auto-complete, go-to-definition, or documentation on hover, etc. there have been previous attempts at adding Postgres support to code editors. usually these attempts implement a generic SQL parser and then offer various "flavours" of SQL. This attempt is different because it uses the actual Postgres parser to do the heavy-lifting. This is done via libpg_query, an excellent C library for accessing the PostgreSQL parser outside of the server. We feel this is a better approach because it gives developers 100% confidence in the parser, and it allows us to keep up with the rapid development of Postgres. this is still in early development, and mostly useful for testers/collaborators. the majority of work is still ahead, but we've verified that the approach works. we're making it public now so that we can develop it in the open with input from the community. a lot of the credit belongs to pganalyze[1] for their work on libpg_query, and to psteinroe ( https://ift.tt/kBHz9Rc ) who is the creator and maintainer. [0] LSP: https://ift.tt/97JvoCm [1] pganalyze: https://pganalyze.com/

New best story on Hacker News: Zoom terms now allow training AI on user content with no opt out

Zoom terms now allow training AI on user content with no opt out
698 by isodev | 264 comments on Hacker News.


New best story on Hacker News: Just normal web things

Just normal web things
535 by vitplister | 210 comments on Hacker News.