About me
This is a page not in the main menu
Published:
PCA analysis of the image augmentation techniques used in state-of-the-art image classification models. See full article here
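As an illustration of where PCA meets augmentation, here is a minimal sketch of AlexNet-style "fancy PCA" colour augmentation; the function name and parameter values are my own assumptions, not code from the article:

```python
import numpy as np

def fancy_pca(image, alpha_std=0.1, seed=None):
    """AlexNet-style PCA colour augmentation for an RGB image in [0, 1]."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3)                 # flatten to an (N, 3) pixel matrix
    cov = np.cov(pixels - pixels.mean(axis=0), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # principal colour directions
    alpha = rng.normal(0.0, alpha_std, size=3)    # random magnitude per component
    shift = eigvecs @ (alpha * eigvals)           # perturb along the components
    return np.clip(pixels + shift, 0.0, 1.0).reshape(image.shape)
```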
Published:
Save classification labels and top confidences in a custom layer using Keras. See full article here
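A minimal sketch of the idea, assuming class names are known when the model is built (the layer name and interface are assumptions, not necessarily the article's):

```python
import tensorflow as tf

class TopKLabels(tf.keras.layers.Layer):
    """Map softmax probabilities to top-k (label, confidence) pairs."""
    def __init__(self, labels, k=3, **kwargs):
        super().__init__(**kwargs)
        self.labels = tf.constant(labels)   # class names, e.g. ["cat", "dog", ...]
        self.k = k

    def call(self, probs):
        top = tf.math.top_k(probs, k=self.k)
        return tf.gather(self.labels, top.indices), top.values
```

Attached after the softmax output, a served model can then return human-readable labels together with their confidences.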
Published:
Code snippet for linear algebra and computer vision. See full article here
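The snippet itself is not reproduced here, but as a hedged illustration of the theme, this is the kind of linear-algebra building block computer vision pipelines rely on, rotating 2-D points with a rotation matrix:

```python
import numpy as np

# Rotate 2-D points 45 degrees around the origin with a rotation matrix.
theta = np.deg2rad(45)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
points = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # (x, y) pairs
rotated = points @ R.T      # apply the transform to every point at once
print(rotated)
```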
Published:
This story shows how to visualize pre-trained BERT embeddings in TensorFlow's TensorBoard Embedding Projector. The story uses around 50 unique sentences and their BERT embeddings generated with TensorFlow Hub BERT models. See full article here
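One simple way to reproduce such a visualisation (a sketch, not necessarily the story's exact workflow) is to export the vectors and their labels as TSV files, which the standalone Embedding Projector at projector.tensorflow.org accepts:

```python
import numpy as np

# Placeholder data: `sentences` and their (N, 768) BERT vectors would come
# from a TensorFlow Hub BERT model, as in the story.
sentences = ["a sample sentence", "another sample sentence"]
embeddings = np.random.rand(len(sentences), 768)

np.savetxt("vectors.tsv", embeddings, delimiter="\t")   # one vector per row
with open("metadata.tsv", "w") as f:
    f.write("\n".join(sentences))                       # one label per line
```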
Published:
Comparing the tokenizer vocabularies of state-of-the-art Transformers (BERT, GPT-2, RoBERTa, XLM). See full article here
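A sketch of such a comparison with today's Hugging Face transformers API (an assumption: the story predates this exact API). Surface-form overlap is only a rough measure, since GPT-2's byte-level BPE tokens are encoded differently from BERT's WordPieces:

```python
from transformers import AutoTokenizer

bert = AutoTokenizer.from_pretrained("bert-base-uncased")
gpt2 = AutoTokenizer.from_pretrained("gpt2")

bert_vocab = set(bert.get_vocab())
gpt2_vocab = set(gpt2.get_vocab())

print(len(bert_vocab), len(gpt2_vocab))   # vocabulary sizes
print(len(bert_vocab & gpt2_vocab))       # tokens shared by surface form
```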
Published:
Practical advice from analysing my first month as a Towards Data Science writer. See full article here
Published:
This story shows a simple usage of BERT [1] embeddings with TensorFlow 2.0. As TensorFlow 2.0 has recently been released, the module aims to provide easy, ready-to-use models based on the high-level Keras API. A previous usage of BERT was described in a long notebook implementing movie review prediction. In this story, we build a simple BERT embedding generator using Keras and the latest TensorFlow and TensorFlow Hub modules. All code is available on Google Colab. See full article here
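A minimal sketch of such an embedding generator with current TF Hub handles (the module URLs and output keys below are assumptions and may differ from the versions used in the story):

```python
import tensorflow as tf
import tensorflow_hub as hub

preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

sentences = tf.constant(["BERT embeddings in a few lines."])
outputs = encoder(preprocess(sentences))
pooled = outputs["pooled_output"]       # (batch, 768) sentence embedding
tokens = outputs["sequence_output"]     # (batch, seq_len, 768) token embeddings
```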
Published:
My previous story describes BLEU, the most used metric for Machine Translation (MT). This one introduces the conferences, datasets, and competitions where you can compare your models with the state of the art, collect knowledge, and meet researchers from the field. See full article here
Published:
An important reason for using contextualised word embeddings is that standard embeddings assign a single vector to each word regardless of its meaning, even though many words have multiple meanings. The hypothesis is that using the context can solve the problem of collapsing multiple-meaning words (homonyms and homographs) into the same embedding vector. In this story, we analyse whether BERT embeddings can be used to classify the different meanings of a word, to test whether contextualised word embeddings solve this problem. See full article here
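A hedged sketch of the core experiment, using the Hugging Face transformers API rather than the story's own code: embed the same word in two sentences and compare the contextual vectors:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence, word):
    """Contextual embedding of the first occurrence of `word` in `sentence`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, 768)
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[idx]

river = word_vector("He sat on the bank of the river.", "bank")
money = word_vector("She deposited cash at the bank.", "bank")
# Different senses of "bank" should be further apart than same-sense pairs.
print(torch.cosine_similarity(river, money, dim=0))
```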
Published:
This story is an overview of the field of Machine Translation. It introduces several highly cited works and famous applications, but I'd like to encourage you to share your opinion in the comments. The aim of this story is to provide a good starting point for someone new to the field. It covers the three main approaches to machine translation as well as several challenges of the field. Hopefully, the literature mentioned in the story presents the history of the problem as well as the state-of-the-art solutions. See full article here
Published:
In this story, we visualise word embedding vectors to understand the relations between the words they describe. The story focuses on word2vec [1] and BERT [2]. To understand the embeddings themselves, I suggest reading a separate introduction, as this story does not aim to describe them. See full article here
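For a flavour of the approach, a minimal sketch using small pretrained GloVe vectors (an assumption chosen for quick download; the story itself works with word2vec and BERT):

```python
import gensim.downloader as api
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

model = api.load("glove-wiki-gigaword-50")    # small pretrained word vectors
words = ["king", "queen", "man", "woman", "paris", "london"]
coords = PCA(n_components=2).fit_transform([model[w] for w in words])

plt.scatter(coords[:, 0], coords[:, 1])
for (x, y), word in zip(coords, words):
    plt.annotate(word, (x, y))               # label each projected vector
plt.show()
```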
Published:
The goal of this story is to understand BLEU, a widely used metric for MT models, and to investigate its relation to BERT. See full article here
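For reference, computing BLEU for a single sentence pair takes a few lines with NLTK (a sketch, not necessarily the story's setup):

```python
from nltk.translate.bleu_score import sentence_bleu

reference = ["the", "cat", "is", "on", "the", "mat"]
hypothesis = ["the", "cat", "sat", "on", "the", "mat"]

# Bigram BLEU (weights over 1-grams and 2-grams); the default is 4-gram
# BLEU, which is uninformative on a toy sentence this short.
print(sentence_bleu([reference], hypothesis, weights=(0.5, 0.5)))
```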
Published:
Minimum spanning tree using NumPy array operations. See full article here
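A compact sketch of the idea, here with Prim's algorithm driven by NumPy masking and argmin over a dense adjacency matrix (the choice of Prim's is my assumption, not necessarily the article's):

```python
import numpy as np

def prim_mst(weights):
    """Prim's algorithm on a dense adjacency matrix; np.inf marks no edge."""
    n = len(weights)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    edges = []
    for _ in range(n - 1):
        # Keep only edges from the tree to the rest, then take the lightest.
        candidate = np.where(in_tree[:, None] & ~in_tree[None, :], weights, np.inf)
        i, j = np.unravel_index(np.argmin(candidate), candidate.shape)
        edges.append((int(i), int(j), float(weights[i, j])))
        in_tree[j] = True
    return edges

graph = np.array([[np.inf, 2.0, 3.0],
                  [2.0, np.inf, 1.0],
                  [3.0, 1.0, np.inf]])
print(prim_mst(graph))   # [(0, 1, 2.0), (1, 2, 1.0)]
```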
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in XIV. Magyar Számítógépes Nyelvészeti Konferencia, 2018
Hyphenation algorithms are computer-based methods of syllabification, used mostly in typesetting and document formatting as well as in text-to-speech and speech recognition systems. We present a deep learning approach to the automatic hyphenation of Hungarian text. Our experiments compare feed-forward, recurrent, and convolutional neural network approaches.
Recommended citation: Németh, G. D., Ács, J. (2018). "Hyphenation using deep neural networks." XIV. Magyar Számítógépes Nyelvészeti Konferencia. http://negedng.github.io/files/2018-Hyphenation.pdf
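The task can be framed as per-character binary tagging: does a hyphenation point follow this character? A minimal sketch of a recurrent variant, with every hyperparameter assumed rather than taken from the paper:

```python
import tensorflow as tf

NUM_CHARS = 64   # size of the character inventory (an assumption)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(NUM_CHARS, 32),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # hyphen / no hyphen
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```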
Published in Transactions on Machine Learning Research, 2022
Federated learning (FL) has been proposed as a privacy-preserving approach to distributed machine learning. A federated learning architecture consists of a central server and a number of clients that have access to private, potentially sensitive data. Clients are able to keep their data on their local machines and only share their locally trained model's parameters with a central server that manages the collaborative learning process. FL has delivered promising results in real-life scenarios, such as healthcare, energy, and finance. However, when the number of participating clients is large, the overhead of managing the clients slows down the learning. Thus, client selection has been introduced as a strategy to limit the number of communicating parties at every step of the process. Since the early naive random selection of clients, several client selection methods have been proposed in the literature. Unfortunately, as this is an emerging field, there is no established taxonomy of client selection methods, making it hard to compare approaches. In this paper, we propose a taxonomy of client selection in Federated Learning that enables us to shed light on current progress in the field and identify potential areas of future research in this promising area of machine learning.
Recommended citation: Németh, G. D., Lozano, M. A., Quadrianto, N., & Oliver, N. (2022). "A Snapshot of the Frontiers of Client Selection in Federated Learning." Transactions on Machine Learning Research. http://negedng.github.io/files/2022-Snapshot.pdf
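For orientation, a toy sketch of one FL round with the naive random selection that the survey takes as its starting point; the `clients[i].train` interface is hypothetical:

```python
import numpy as np

def federated_round(global_weights, clients, fraction=0.1, seed=None):
    """One FL round with naive random client selection (toy sketch)."""
    rng = np.random.default_rng(seed)
    k = max(1, int(fraction * len(clients)))
    selected = rng.choice(len(clients), size=k, replace=False)
    # Each selected client trains locally and returns updated parameters;
    # `train` is a hypothetical client interface, not defined in the paper.
    updates = [clients[i].train(global_weights) for i in selected]
    return np.mean(updates, axis=0)   # FedAvg-style parameter averaging
```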
Undergraduate course, Budapest University of Technology and Economics, Department of Computer Science and Information Theory, 2016
This was an undergraduate teaching assistant position. The goal of the course is for students to acquire the fundamental mathematical knowledge (linear algebra and number theory) necessary for software engineering studies.
High school course, ELTE Radnóti Miklós School, 2016
This was a one-year experience as a teacher of 11th-12th grade students. The course is a specialisation covering the basics of programming with the help of ColoBot, Processing, and Java.
Undergraduate course, Budapest University of Technology and Economics, Department of Computer Science and Information Theory, 2017
This was an undergraduate teaching assistant position. The goal of the course is for students to acquire the fundamental mathematical knowledge (linear algebra and number theory) necessary for software engineering studies.
High school course, Fazekas Mihály School, 2017
This is an advanced-level programming course for high school students. Students of this course competed at various levels of programming championships.
Undergraduate course, Budapest University of Technology and Economics, Department of Computer Science and Information Theory, 2017
This was an undergraduate teaching assistant position. The goal of the course is for students to acquire the fundamental mathematical knowledge (linear algebra and number theory) necessary for software engineering studies.
Online course, Udemy, 2018
This is an online course teaching programming in Hungarian. Originally, the course was free, but I realised that this reduced students' motivation. Therefore, I changed its price to the minimum amount Udemy allows.