What We Learned About Backend Development of Big Open Source Portals

Digital Design creates portal solutions that give end users a wide range of customizable services within one website: newsfeeds, forums, data storage, statistics, corporate events, and more. That is what we call a portal. In this article we explain how our team builds portal solutions: what we use, and what the pros and cons are.

The idea of building portals ourselves did not arise by chance. The use of proprietary software has been restricted in Russia for some time, which has affected big IT companies working with the public sector. Our experience with external and internal portals for large companies in Russia and abroad lets us assess the best techniques and standardize many aspects.

For example, the website of the Federal Technical Regulation and Metrology Agency is publicly available and attracts many users, so we looked for and built solutions that stay stable under heavy load. Keep in mind that many portal elements have to be indexed by search engines, which is neither easy nor obvious for open source portals. We also did not use ready-made solutions or CMSs, because they would not meet our clients’ requirements.

We prefer to develop in-house because every portal service is a self-sufficient unit that can exist independently of the others. This lets us develop, upgrade, and scale them separately.

This article is the first part of a series. Here we explain how everything looks from the backend side: what technologies we use, how the architecture is organized, and what difficulties and advantages we found in our approach. We will cover frontend development in upcoming articles.

So, let’s begin.

Architecture

Our main goal was to develop a platform based on open source solutions that could be easily scaled and maintained.

The platform has two deployment options, depending on the customer’s available capacity:

1) The first option requires more resources, but its microservice architecture makes it easy to scale and to maintain several systems at once. There are many ways to implement it; we chose the Spring Cloud framework (Gateway, Discovery) together with Spring Boot, which our Java architect judged to be the most appropriate technology stack.

System architecture, Spring Cloud and Spring Boot

2) The second option is modular. It can be used when only one system or portal has to be maintained, and it allows the whole infrastructure to be deployed on one or two servers.

System architecture, modular
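
To make the first option more concrete, here is a minimal plain-Java sketch of the pattern that a discovery server and a gateway implement together: service instances register under a logical name, and the gateway picks an instance for each request in round-robin order. It is an illustration only, not Spring Cloud code; the class `ServiceRegistrySketch`, its methods, and the host names used with it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for a discovery server plus the
// load-balancing step of a gateway. Not part of Spring Cloud.
class ServiceRegistrySketch {
    // Logical service name -> addresses of the running instances.
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    // Logical service name -> counter used for round-robin selection.
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Called by a service instance when it starts.
    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called when an instance shuts down or is considered unhealthy.
    void deregister(String service, String address) {
        List<String> list = instances.get(service);
        if (list != null) {
            list.remove(address);
        }
    }

    // Called for each incoming request: returns the next instance
    // of the service in round-robin order.
    String resolve(String service) {
        List<String> registered = instances.get(service);
        // Work on a snapshot so a concurrent deregister cannot change the size mid-call.
        List<String> snapshot = registered == null
                ? new ArrayList<String>()
                : new ArrayList<>(registered);
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("No instances registered for " + service);
        }
        AtomicInteger counter = counters.computeIfAbsent(service, k -> new AtomicInteger());
        int index = Math.floorMod(counter.getAndIncrement(), snapshot.size());
        return snapshot.get(index);
    }
}
```

In this sketch, scaling a service means registering more instances under the same name; callers keep using the logical name and never need to know individual addresses.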

Technologies Used

OpenJDK 8

WildFly. The application runtime.

PostgreSQL. The DBMS. All modules use ORM technologies, so they can be migrated to another DBMS easily.

eXo Platform. A portal platform that can combine various modules into a unified web application, manage their placement on a webpage, administer access rights, and more. The Community version can be downloaded from the official website, but only for the Tomcat application server. The source code can be adapted to WildFly manually.

Modules, or portlets, live in a portlet container and render their own area of the page. Portlets are developed according to the Portlet Specification 3.0 (JSR 362).

Spring Framework. The basis for all apps developed by Digital Design in Java. This framework has many modules for many kinds of tasks (Spring MVC, Spring Mail, Spring Data (JPA), Spring JMS and more).

Spring Cloud. The basis for microservices development (Discovery, Gateway, Ribbon, Config Server).

Keycloak. An identity and access management server (IDM/IAM). It supports several authentication and authorization protocols (OpenID Connect, SAML, OAuth, Kerberos), which lets us manage authorization strategies, set access rights, and keep access to the system’s resources safe.

Kurento. A media server used for video/audio broadcasts between users, based on the WebRTC protocol.

Elasticsearch. A full-text search server. We developed our own crawler to walk through a portal and collect portal data for indexing, and a parser to extract file contents.

Swagger. A suite of API developer tools used to auto-generate documentation for the REST API and to connect to it.

JasperReports. A framework for report generation. Jaspersoft Studio is used to create report templates.

Graylog. A server for collecting logs from applications.

All modules are built with Maven and delivered through CI/CD (Jenkins).

Microservices

The following microservices support the modules:

File management service. A microservice that stores data in a unified storage, analyzes content, and converts files for display in the web interface. It is also used to download files, link them to other systems, co-edit them, and keep their version history.

Full-text search service. A microservice for searching across systems and modules. It indexes objects and system files using various analyzers pre-configured in Elasticsearch. Files are parsed with Apache Tika, which recognizes many document types and extracts their contents.

Communication service. A microservice used to exchange messages between users over WebSockets and to run video/audio broadcasts over WebRTC using the Kurento media server.

Statistics service. A microservice used to collect statistical data from system modules and generate relevant reports.

Parameters/configurations service. A microservice for centralized storage of module settings and of users’ personal settings.

Organizational structure service. A microservice that collects information about employees and departments from various sources, aggregates it into a single storage, and passes it on to the end modules.

Business process service. A microservice that executes complex workflows for specific tasks based on the BPMN specification. jBPM was chosen as the engine.
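
The parameters/configurations service above stores both module settings and personal settings. One plausible way to resolve a value, sketched here in plain Java, is to check the user’s personal override first and fall back to the module-wide value. The class `SettingsStoreSketch`, its method names, and the key layout are assumptions made for illustration; the actual storage model is not described in this article.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of a settings store with two layers:
// module-wide values and per-user overrides.
class SettingsStoreSketch {
    // Key layout (an assumption): "module/key". Names must not contain "/".
    private final Map<String, String> moduleSettings = new ConcurrentHashMap<>();
    // Key layout (an assumption): "user/module/key".
    private final Map<String, String> personalSettings = new ConcurrentHashMap<>();

    // Stores a value that applies to every user of the module.
    void setModuleSetting(String module, String key, String value) {
        moduleSettings.put(module + "/" + key, value);
    }

    // Stores a value that applies to one user only.
    void setPersonalSetting(String user, String module, String key, String value) {
        personalSettings.put(user + "/" + module + "/" + key, value);
    }

    // Returns the personal value if the user has one, otherwise the
    // module-wide value, otherwise an empty Optional.
    Optional<String> resolve(String user, String module, String key) {
        String personal = personalSettings.get(user + "/" + module + "/" + key);
        if (personal != null) {
            return Optional.of(personal);
        }
        return Optional.ofNullable(moduleSettings.get(module + "/" + key));
    }
}
```

Keeping the fallback logic in one service means individual modules do not need to know whether a value came from a user’s preferences or from the module defaults.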

Modules

WCM. For managing formatted content on portal pages.

Document library. For managing, storing, editing, and supporting versions of documents.

Newsfeed. A company news aggregator.

Forum. A classic full-featured forum.

Events calendar. A calendar of events and activities.

Chat. A classic chat with moderation and rooms.

Streaming. For streaming events for all users with screen sharing and moderation.

Media library. A library for media content. Streaming videos, comments, likes, and other delights.

Organizational structure. For handbooks and reports.

Administrative module. For managing access rights to nodes, pages, portlets, and other portal data. It also controls the appearance of portlets, pages, and virtual portals.

Navigation. For creating custom navigation for the portal and portal nodes.

Surveys and tests. For running tests and surveys of various types (from a single choice to loop questions and differentials). Created for testing, training, and conducting surveys for portal users.

Additional modules can be added according to specific clients’ business processes (about 30 in total).

Difficulties and Advantages

Advantages of this solution:

  • Thanks to its architecture, the system can be deployed on a single server or on multiple servers, which allows more flexible load-balancing configuration.
  • All modules are developed independently, so new features can be added to the portal safely, without affecting other functionality.
  • Each module connects to its own database through JNDI, so each connection can be managed and configured separately.
  • Frequently used functions are provided by microservices, which saves time when developing new modules.
  • New virtual portals can be developed with customized settings, themes, and access rights.
  • A unified authentication server makes it possible to add new systems and configure authorization rules.
  • All modules are developed according to generally accepted specifications. This allows open source modules from third-party developers to be deployed.
  • Portal modules are configurable, so each installation can choose which modules are enabled.
  • It is based entirely on open source technologies, which makes it easier to update and develop it further.

Problems that we faced:

  • Maintaining the backward compatibility of microservices’ and modules’ APIs. For now this can only be addressed by running integration tests and clarifying requirements.
  • Managing access rights to microservice resources. We plan to introduce attribute-based access control (ABAC).
  • Standardizing log formats. We plan to develop a wrapper library that logs in the required format.
  • Centralized data caching. We plan to implement a centralized cache server. We also need centralized collection of logs from the dev, test, and prod environments to track the stability and correctness of our ecosystem.
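
As an illustration of the planned ABAC approach for microservice resources, the following plain-Java sketch decides access by evaluating rules over attributes of the user, the resource, and the requested action, rather than over fixed roles. All names here (`AbacSketch`, `Request`, `permitWhen`, the attribute keys) are hypothetical; the article does not say which ABAC product or policy model will be used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical, minimal attribute-based access control (ABAC) evaluator.
class AbacSketch {
    // One access request: who (subject attributes), what (resource
    // attributes), and which operation (action).
    static final class Request {
        final Map<String, String> subject;
        final Map<String, String> resource;
        final String action;

        Request(Map<String, String> subject, Map<String, String> resource, String action) {
            this.subject = subject;
            this.resource = resource;
            this.action = action;
        }
    }

    // Each rule inspects the attributes and says whether it permits the request.
    private final List<Predicate<Request>> permitRules = new ArrayList<>();

    // Registers a rule that grants access when it evaluates to true.
    void permitWhen(Predicate<Request> rule) {
        permitRules.add(rule);
    }

    // Deny by default: access is granted only if at least one rule permits it.
    boolean isAllowed(Request request) {
        for (Predicate<Request> rule : permitRules) {
            if (rule.test(request)) {
                return true;
            }
        }
        return false;
    }
}
```

A rule such as "users may read documents owned by their own department" would then be registered as a predicate that checks `action` for `"read"` and compares an assumed `department` attribute of the subject with an assumed `ownerDepartment` attribute of the resource.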

Finally

This set of technologies and the chosen solution architecture allow us to develop portals, intranets, and external solutions quickly and efficiently, applying different settings and using best practices.