Publications (10 of 21)
Pena, F. J., Gonzalez, A. L., Pashami, S., Al-Shishtawy, A. & Payberah, A. H. (2022). Siambert: Siamese Bert-based Code Search. In: 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022. Paper presented at 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022, 13 June 2022 through 14 June 2022. Institute of Electrical and Electronics Engineers Inc.
2022 (English). In: 34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022, Institute of Electrical and Electronics Engineers Inc., 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Code Search is a practical tool that helps developers navigate growing source code repositories by connecting natural language queries with code snippets. Platforms such as StackOverflow answer coding questions; however, they cannot perform a semantic search through the code. Moreover, poorly documented code makes searching for code snippets in repositories more complex. To tackle this challenge, this paper presents Siambert, a BERT-based model that takes a question in natural language and returns relevant code snippets. The Siambert architecture consists of two stages: the first stage, inspired by Siamese Neural Networks, returns the top K code snippets relevant to the input question, and the second stage ranks the snippets returned by the first stage. The experiments show that Siambert outperforms non-BERT-based models, with improvements ranging from 12% to 39% on the Recall@1 metric, and improves inference time, making it 15x faster than standard BERT models.
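The two-stage idea in this abstract (retrieve the top K candidates with a shared encoder, then rank only those candidates) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a BERT encoder, `rerank` uses a simple Jaccard overlap rather than a learned model, and `SNIPPETS` is a made-up corpus. None of it is taken from the paper.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a shared (Siamese) encoder: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, snippets, k):
    """Stage 1: encode query and snippets with the same encoder, keep the top k."""
    q = embed(query)
    return sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]

def rerank(query, candidates):
    """Stage 2: hypothetical pairwise scorer (Jaccard overlap), run only on the k candidates."""
    q = set(embed(query))

    def score(snippet):
        tokens = set(embed(snippet))
        union = q | tokens
        return len(q & tokens) / len(union) if union else 0.0

    return sorted(candidates, key=score, reverse=True)

# Made-up corpus, for illustration only.
SNIPPETS = [
    "def read_file(path): return open(path).read()",
    "def sort_list(items): return sorted(items)",
    "def reverse_list(items): return items[::-1]",
]

def search(query, k=2):
    """Run both stages: cheap retrieval over everything, then ranking of the survivors."""
    return rerank(query, retrieve(query, SNIPPETS, k))
```

The point of the split is cost: the first stage scores each snippet independently of the query encoder call, so snippet vectors could be precomputed, while the more expensive pairwise scorer only sees K items.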

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2022
Keywords
Codes (symbols), Natural language processing systems, Code search, Natural language queries, Natural languages, Neural-networks, Performance, Semantic search, Source code repositories, Semantics
National Category
Economics and Business
Identifiers
urn:nbn:se:ri:diva-60199 (URN); 10.1109/SAIS55783.2022.9833051 (DOI); 2-s2.0-85136132400 (Scopus ID); 9781665471268 (ISBN)
Conference
34th Workshop of the Swedish Artificial Intelligence Society, SAIS 2022, 13 June 2022 through 14 June 2022
Available from: 2022-10-07 Created: 2022-10-07 Last updated: 2023-11-06 Bibliographically approved
Bouguelia, M.-R., Nowaczyk, S. & Payberah, A. (2018). An adaptive algorithm for anomaly and novelty detection in evolving data streams. Data mining and knowledge discovery, 32(6), 1597-1633
2018 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 32, no 6, p. 1597-1633. Article in journal (Refereed), Published
Abstract [en]

In the era of big data, considerable research focus is being put on designing efficient algorithms capable of learning and extracting high-level knowledge from ubiquitous data streams in an online fashion. While most existing algorithms assume that data samples are drawn from a stationary distribution, several complex environments deal with data streams that are subject to change over time. Taking this aspect into consideration is an important step towards building truly aware and intelligent systems. In this paper, we propose GNG-A, an adaptive method for incremental unsupervised learning from evolving data streams experiencing various types of change. The proposed method maintains a continuously updated network (graph) of neurons by extending the Growing Neural Gas algorithm with three complementary mechanisms, allowing it to closely track both gradual and sudden changes in the data distribution. First, an adaptation mechanism handles local changes where the distribution is only non-stationary in some regions of the feature space. Second, an adaptive forgetting mechanism identifies and removes neurons that become irrelevant due to the evolving nature of the stream. Finally, a probabilistic evolution mechanism creates new neurons when there is a need to represent data in new regions of the feature space. The proposed method is demonstrated for anomaly and novelty detection in non-stationary environments. Results show that the method handles different data distributions and efficiently reacts to various types of change.

Keywords
Anomaly and novelty detection, Change detection, Data stream, Growing neural gas, Non-stationary environments, Adaptive algorithms, Big data, Intelligent systems, Neurons, Non-stationary environment, Novelty detection, Data mining
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-33882 (URN); 10.1007/s10618-018-0571-0 (DOI); 2-s2.0-85046792304 (Scopus ID)
Available from: 2018-05-30 Created: 2018-05-30 Last updated: 2020-08-03 Bibliographically approved
Rahimian, F. & Payberah, A. (2017). DOIT WP3 report on predictive modeling and data insights: Version 5.0.
2017 (English). Report (Other academic)
Series
SICS Technical Report, ISSN 1100-3154 ; 2017:06
National Category
Computer Sciences
Identifiers
urn:nbn:se:ri:diva-34281 (URN)
Available from: 2018-07-23 Created: 2018-07-23 Last updated: 2020-01-23 Bibliographically approved
Rahimian, F., Payberah, A., Girdzijauskas, S., Jelasity, M. & Haridi, S. (2015). A distributed algorithm for large-scale graph partitioning. ACM Transactions on Autonomous and Adaptive Systems, 10(2), Article ID 12.
2015 (English). In: ACM Transactions on Autonomous and Adaptive Systems, ISSN 1556-4665, E-ISSN 1556-4703, Vol. 10, no 2, article id 12. Article in journal (Refereed), Published
Abstract [en]

Balanced graph partitioning is an NP-complete problem with a wide range of applications. These applications include many large-scale distributed problems, such as the optimal storage of large sets of graph-structured data over several hosts. However, in very large-scale distributed scenarios, state-of-the-art algorithms are not directly applicable because they typically involve frequent global operations over the entire graph. In this article, we propose a fully distributed algorithm called Ja-be-Ja that uses local search and simulated annealing techniques for two types of graph partitioning: edge-cut partitioning and vertex-cut partitioning. The algorithm is massively parallel: there is no central coordination, each vertex is processed independently, and only the direct neighbors of a vertex and a small subset of random vertices in the graph need to be known locally. Strict synchronization is not required. These features allow Ja-be-Ja to be easily adapted to any distributed graph-processing system, from data centers to fully distributed networks. We show that the minimal edge-cut value empirically achieved by Ja-be-Ja is comparable to state-of-the-art centralized algorithms such as Metis. In particular, on large social networks, Ja-be-Ja outperforms Metis. We also show that Ja-be-Ja computes very low vertex-cuts, which have proved significantly more effective than edge-cuts for processing most real-world graphs.
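As a rough illustration of the kind of procedure this abstract describes (local search over a vertex's neighbors plus a few random vertices, with simulated annealing), the following is a simplified sketch. The swap rule, the temperature schedule, the parameter names (`t0`, `delta`, `sample`) and the example graph are all assumptions chosen for illustration; the abstract does not specify them, and this is not the published algorithm.

```python
import random
from collections import Counter

def edge_cut(graph, color):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u in graph for v in graph[u] if u < v and color[u] != color[v])

def partition_sizes(color):
    """How many vertices each partition (colour) holds."""
    return Counter(color.values())

def degree_in(graph, color, node, c):
    """How many neighbours of `node` currently have colour `c`."""
    return sum(1 for v in graph[node] if color[v] == c)

def partition(graph, color, rounds=50, t0=2.0, delta=0.05, sample=2, seed=0):
    """Swap-based local search; swapping colours keeps partition sizes unchanged."""
    rng = random.Random(seed)
    color = dict(color)  # work on a copy
    nodes = sorted(graph)
    temp = t0
    for _ in range(rounds):
        for p in nodes:
            # Candidate partners: direct neighbours plus a small random sample.
            partners = list(graph[p]) + rng.sample(nodes, sample)
            best, best_new = None, -1
            for q in partners:
                if q == p or color[q] == color[p]:
                    continue
                old = degree_in(graph, color, p, color[p]) + degree_in(graph, color, q, color[q])
                # Note: when p and q are adjacent this count is only approximate.
                new = degree_in(graph, color, p, color[q]) + degree_in(graph, color, q, color[p])
                # Assumed acceptance rule: temp > 1 tolerates some worsening swaps.
                if new * temp > old and new > best_new:
                    best, best_new = q, new
            if best is not None:
                color[p], color[best] = color[best], color[p]
        temp = max(1.0, temp - delta)  # cool down towards plain local search
    return color

# Hypothetical example: two triangles joined by the edge 2-3, badly coloured at first.
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
INITIAL = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
```

Because vertices only ever exchange colours, the balance of the initial assignment is preserved by construction, and each decision uses only a vertex's neighbors and a small random sample rather than a global view.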

Place, publisher, year, edition, pages
Association for Computing Machinery, 2015
Keywords
Distributed algorithm, Edge-cut partitioning, Graph partitioning, Load balancing, Simulated annealing, Vertex-cut partitioning, Algorithms, Computational complexity, Data handling, Digital storage, Network management, Parallel algorithms, Resource allocation, Balanced graph partitioning, Centralized algorithms, Edge cuts, Graph structured data, Simulated annealing techniques, State-of-the-art algorithms, Vertex-cut, Graph theory
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-41882 (URN); 10.1145/2714568 (DOI); 2-s2.0-84930973235 (Scopus ID)
Available from: 2019-12-12 Created: 2019-12-12 Last updated: 2023-06-07 Bibliographically approved
Rahimian, F., Payberah, A. H., Girdzijauskas, S. & Haridi, S. (2014). Distributed vertex-cut partitioning. In: Lecture Notes in Computer Science. Paper presented at 3 June 2014 through 5 June 2014, Berlin (pp. 186-200). Springer Verlag
2014 (English). In: Lecture Notes in Computer Science, Springer Verlag, 2014, p. 186-200. Conference paper, Published paper (Refereed)
Abstract [en]

Graph processing has become an integral part of big data analytics. With the ever increasing size of the graphs, one needs to partition them into smaller clusters, which can be managed and processed more easily on multiple machines in a distributed fashion. While there exist numerous solutions for edge-cut partitioning of graphs, very little effort has been made for vertex-cut partitioning. This is in spite of the fact that vertex-cuts have proved significantly more effective than edge-cuts for processing most real-world graphs. In this paper we present Ja-be-Ja-vc, a parallel and distributed algorithm for vertex-cut partitioning of large graphs. In a nutshell, Ja-be-Ja-vc is a local search algorithm that iteratively improves upon an initial random assignment of edges to partitions. We propose several heuristics for this optimization and study their impact on the final partitioning. Moreover, we employ a simulated annealing technique to escape local optima. We evaluate our solution on various graphs and with a variety of settings, and compare it against two state-of-the-art solutions. We show that Ja-be-Ja-vc outperforms the existing solutions in that it not only creates partitions of any requested size, but also achieves a vertex-cut that is better than its counterparts and more than 70% better than random partitioning.

Place, publisher, year, edition, pages
Springer Verlag, 2014
Keywords
Big data, Graphic methods, Interoperability, Iterative methods, Simulated annealing, Data analytics, Graph processing, Local search algorithm, Multiple machine, Parallel and distributed algorithms, Random assignment, Real-world graphs, Simulated annealing techniques, Graph theory
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-45563 (URN); 10.1007/978-3-662-43352-2_15 (DOI); 2-s2.0-84902593727 (Scopus ID); 9783662433515 (ISBN)
Conference
3 June 2014 through 5 June 2014, Berlin
Note

Conference code: 105614

Available from: 2020-08-10 Created: 2020-08-10 Last updated: 2023-06-07 Bibliographically approved
Rahimian, F., Payberah, A., Girdzijauskas, S., Jelasity, M. & Haridi, S. (2013). Ja-be-Ja: A Distributed Algorithm for Balanced Graph Partitioning (7 ed.). Kista, Sweden: Swedish Institute of Computer Science
2013 (English). Report (Other academic)
Abstract [en]

Balanced graph partitioning is a well-known NP-complete problem with a wide range of applications. These applications include many large-scale distributed problems, such as the optimal storage of large sets of graph-structured data over several hosts, or identifying clusters in on-line social networks. In such very large-scale distributed scenarios, state-of-the-art algorithms are not directly applicable, because they typically involve frequent global operations over the entire graph. In this paper, we propose a distributed graph partitioning algorithm called Ja-be-Ja. The algorithm is massively parallel: each graph node is processed independently, and only the direct neighbors of the node and a small subset of random nodes in the graph need to be known. Strict synchronization is not required. These features allow Ja-be-Ja to be easily adapted to any distributed graph-processing system, from data centers to fully distributed networks. We perform a thorough experimental analysis, which shows that the minimal edge-cut value achieved by Ja-be-Ja is comparable to state-of-the-art centralized algorithms such as Metis. In particular, on large social networks Ja-be-Ja outperforms Metis.

Place, publisher, year, edition, pages
Kista, Sweden: Swedish Institute of Computer Science, 2013. Edition: 7
Series
SICS Technical Report, ISSN 1100-3154 ; 2013:03
Keywords
graph partitioning, distributed algorithm, load balancing
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24165 (URN)
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2023-06-07 Bibliographically approved
Payberah, A. H., Kavalionak, H., Montresor, A., Dowling, J. & Haridi, S. (2013). Lightweight gossip-based distribution estimation. In: IEEE International Conference on Communications. Paper presented at 2013 IEEE International Conference on Communications, ICC 2013, 9 June 2013 through 13 June 2013, Budapest (pp. 3439-3443). Institute of Electrical and Electronics Engineers Inc., Article ID 6655081.
2013 (English). In: IEEE International Conference on Communications, Institute of Electrical and Electronics Engineers Inc., 2013, p. 3439-3443, article id 6655081. Conference paper, Published paper (Refereed)
Abstract [en]

Monitoring the global state of an overlay network is vital for the self-management of peer-to-peer (P2P) systems. Gossip-based algorithms are a well-known technique that can provide nodes locally with aggregated knowledge about the state of the overlay network. In this paper, we present a gossip-based protocol to estimate the global distribution of attribute values stored across a set of nodes in the system. Our algorithm estimates the distribution both efficiently and accurately. The key contribution of our algorithm is that it has substantially lower overhead than existing distribution estimation algorithms. We evaluated our system in simulation and compared it against state-of-the-art solutions. The results show accuracy similar to that of its counterparts, but with a communication overhead an order of magnitude lower.
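To make the general idea of gossip-based distribution estimation concrete, here is a minimal simulation of plain pairwise averaging, a generic gossip aggregation technique. It is not the protocol from this paper (whose contribution is a lower-overhead scheme); the function name, the one-hot initialization and the parameters are assumptions for illustration.

```python
import random

def gossip_histogram(values, bins, rounds=50, seed=0):
    """Simulate pairwise-averaging gossip; each node ends with an estimate of
    the fraction of nodes holding each value in `bins`."""
    rng = random.Random(seed)
    n = len(values)
    # Each node starts knowing only its own value: a one-hot vector over the bins.
    state = [[1.0 if v == b else 0.0 for b in bins] for v in values]
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer other than i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both nodes replace their vectors with the element-wise average,
            # which keeps the network-wide total of every bin unchanged.
            avg = [(a + b) / 2 for a, b in zip(state[i], state[j])]
            state[i], state[j] = avg, list(avg)
    return state
```

For example, with 5 nodes holding `"a"` and 15 holding `"b"`, every node's local vector approaches `[0.25, 0.75]` after enough rounds, without any node ever seeing the whole network. The cost of this naive version grows with the number of bins sent per exchange, which is the kind of overhead the abstract says the paper reduces.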

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2013
Keywords
Algorithms, Distributed computer systems, Overlay networks, Peer to peer networks, Attribute values, Communication overheads, Distribution estimation, Distribution estimation algorithms, Global distribution, Gossip-based algorithms, Gossip-based protocol, Peer-to-Peer system, Estimation
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-47633 (URN); 10.1109/ICC.2013.6655081 (DOI); 2-s2.0-84891358815 (Scopus ID); 9781467331227 (ISBN)
Conference
2013 IEEE International Conference on Communications, ICC 2013, 9 June 2013 through 13 June 2013, Budapest
Available from: 2020-08-28 Created: 2020-08-28 Last updated: 2023-06-07 Bibliographically approved
Payberah, A. (2013). Live Streaming in P2P and Hybrid P2P-Cloud Environments for the Open Internet (7 ed.). (Doctoral dissertation).
2013 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Peer-to-Peer (P2P) live media streaming is an emerging technology that reduces the barrier to stream live events over the Internet. However, providing a high quality media stream using P2P overlay networks is challenging and gives rise to a number of issues: (i) how to guarantee quality of service (QoS) in the presence of dynamism, (ii) how to incentivize nodes to participate in media distribution, (iii) how to avoid bottlenecks in the overlay, and (iv) how to deal with nodes that reside behind Network Address Translation gateways (NATs). In this thesis, we answer the above research questions in the form of new algorithms and systems. First of all, we address problems (i) and (ii) by presenting our P2P live media streaming solutions: Sepidar, which is a multiple-tree overlay, and GLive, which is a mesh overlay. In both models, nodes with higher upload bandwidth are positioned closer to the media source. This structure reduces the playback latency and increases the playback continuity at nodes, and also incentivizes the nodes to provide more upload bandwidth. We use a reputation model to improve node participation in media distribution in Sepidar and GLive. In both systems, nodes audit the behaviour of their directly connected nodes by getting feedback from other nodes. Nodes that upload more of the stream get a relatively higher reputation, and proportionally higher quality streams. To construct our streaming overlay, we present a distributed market model inspired by Bertsekas' auction algorithm, although our model does not rely on a central server with global knowledge. In our model, each node has only partial information about the system. Nodes acquire knowledge of the system by sampling nodes using the Gradient overlay, which facilitates the discovery of nodes with similar upload bandwidth.
We address the bottleneck problem, problem (iii), by presenting CLive, which satisfies real-time constraints on the delay between the generation of the stream and its actual delivery to users. We resolve this problem by borrowing some resources (helpers) from the cloud when needed. In our approach, helpers are added on demand to the overlay to increase the total available bandwidth, thus increasing the probability of receiving the video on time. As the use of cloud resources costs money, we model the problem as the minimization of the economic cost, provided that a set of constraints on QoS is satisfied. Finally, we solve the NAT problem, problem (iv), by presenting two NAT-aware peer sampling services (PSS): Gozar and Croupier. Traditional gossip-based PSS breaks down when a high percentage of nodes are behind NATs. We overcome this problem in Gozar using one-hop relaying to communicate with the nodes behind NATs. Croupier similarly implements a gossip-based PSS, but without the use of relaying.

Series
SICS dissertation series, ISSN 1101-1335
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24220 (URN)
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-12-01 Bibliographically approved
Jimenez, J., Baig, R., Escrich, P., Khan, A. M., Freitag, F., Navarro, L., . . . Vlassov, V. (2013). Supporting cloud deployment in the Guifi.net community network. In: Global Information Infrastructure Symposium, GIIS 2013. Paper presented at 2013 Global Information Infrastructure Symposium, GIIS 2013; Trento; Italy; 28 October 2013 through 31 October 2013. Article ID 6684361.
2013 (English). In: Global Information Infrastructure Symposium, GIIS 2013, 2013, article id 6684361. Conference paper, Published paper (Refereed)
Abstract [en]

Community networking is an emerging model of a shared communication infrastructure in which communities of citizens build and own open networks. Community networks successfully offer IP-based networking to the user. Cloud computing infrastructures, however, while common in today's Internet, hardly exist in community networks. We explain our approach to bringing clouds into the Guifi.net community network. For this we have started integrating part of our cloud prototype into the Guifi.net community network management tools. A proof-of-concept cloud infrastructure is currently under deployment in the Guifi.net community network. Our long-term vision is that the users of community networks will not need to consume cloud applications from the Internet, but will find them within the community network.

Keywords
cloud computing, community networks, Cloud applications, Cloud computing infrastructures, Cloud deployments, Cloud infrastructures, Communication infrastructure, IP-based networking, Proof of concept, Network management, Internet
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-48697 (URN); 10.1109/GIIS.2013.6684361 (DOI); 2-s2.0-84893215300 (Scopus ID); 9781479929696 (ISBN)
Conference
2013 Global Information Infrastructure Symposium, GIIS 2013; Trento; Italy; 28 October 2013 through 31 October 2013
Available from: 2020-09-18 Created: 2020-09-18 Last updated: 2023-04-19 Bibliographically approved
Payberah, A. H., Kavalionak, H., Kumaresan, V., Montresor, A. & Haridi, S. (2012). CLive: Cloud-assisted P2P live streaming. In: 2012 IEEE 12th International Conference on Peer-to-Peer Computing, P2P 2012. Paper presented at 2012 IEEE 12th International Conference on Peer-to-Peer Computing, P2P 2012, 3 September 2012 through 5 September 2012, Tarragona (pp. 79-90). Article ID 6335820.
2012 (English). In: 2012 IEEE 12th International Conference on Peer-to-Peer Computing, P2P 2012, 2012, p. 79-90, article id 6335820. Conference paper, Published paper (Refereed)
Abstract [en]

Peer-to-peer (P2P) video streaming is an emerging technology that reduces the barrier to stream live events over the Internet. Unfortunately, satisfying soft real-time constraints on the delay between the generation of the stream and its actual delivery to users is still a challenging problem. Bottlenecks in the available upload bandwidth, both at the media source and inside the overlay network, may limit the quality of service (QoS) experienced by users. A potential solution for this problem is assisting the P2P streaming network with a cloud computing infrastructure to guarantee a minimum level of QoS. In such an approach, rented cloud resources (helpers) are added on demand to the overlay to increase the total available bandwidth and the probability of receiving the video on time. Hence, the problem to be solved becomes minimizing the economic cost, provided that a set of constraints on QoS is satisfied. The main contribution of this paper is CLIVE, a cloud-assisted P2P live streaming system that demonstrates the feasibility of these ideas. CLIVE estimates the available capacity in the system through a gossip-based aggregation protocol and provisions the required resources from the cloud to guarantee a given level of QoS at low cost. We perform extensive simulations and evaluate CLIVE using large-scale experiments under dynamic realistic settings.

Keywords
Available bandwidth, Available capacity, Computing infrastructures, Economical Costs, Emerging technologies, Extensive simulations, Large scale experiments, Live streaming, Low costs, P2P streaming, Peer to peer, Potential solutions, Soft real time, Bandwidth, Cost benefit analysis, Internet protocols, Overlay networks, Peer to peer networks, Quality of service, Video streaming, Distributed computer systems
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-51032 (URN); 10.1109/P2P.2012.6335820 (DOI); 2-s2.0-84870380886 (Scopus ID); 9781467328623 (ISBN)
Conference
2012 IEEE 12th International Conference on Peer-to-Peer Computing, P2P 2012, 3 September 2012 through 5 September 2012, Tarragona
Available from: 2021-01-11 Created: 2021-01-11 Last updated: 2023-06-07 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-2748-8929
