Tuesday, February 26, 2019

Industry Practices and Tools 2

Importance of maintaining the quality of the code, explaining the different aspects of the code quality

Readability - refers to how easily the code can be read and how well its layout is organized. It is very important that your code is readable: when you or someone else reviews it, a comprehensible layout makes any errors easier to notice, whereas cluttered code takes much longer to search through when hunting for errors or making changes.

Robustness - refers to how tough the code is: how well it can deal with various inputs and whether it can detect and mitigate errors. This is very important, as a programmer does not want code that easily falls apart; robust code makes your job easier by detecting and handling bad input and error conditions.

Portability - refers to how well the code can be used across different operating systems, computers and environments. This aspect is very important because you may be working on something that has to be used by another company, possibly in another country, so you have to make sure your code can be understood and run on other operating systems, computers and organizations.

Maintainability - refers to how easy it is to maintain the code. This is also highly important: you will encounter issues throughout programming, since that is part of the process of making good code, but even after you are done, bugs and errors will appear that you may not have seen before. Easily maintainable code means you can go in, quickly find the issue and resolve it; it also covers how easy the code is to edit and extend.

Different approaches and measurements used to measure the quality of code

1. Use a Coding Standard

Using a coding standard is one of the best ways to ensure high quality code.
A coding standard makes sure everyone uses the right style. It improves consistency and readability of the codebase. This is key for lower complexity and higher quality.

2. Analyze Code — Before Code Reviews
Quality should be a priority from the very start of development. There isn’t always the luxury of time as development progresses. That’s why it’s important to analyze code before code reviews begin. And it’s best to analyze code as soon as it’s written.
In DevOps, code analysis takes place during the create phase. Static analyzers can be run over code as soon as it’s written. This creates an automated feedback loop, so developers can improve code before it goes to the code review phase.
After all, the earlier you find errors, the faster, easier, and cheaper they are to resolve.
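
As an illustration of analyzing code as soon as it is written, here is a minimal sketch assuming a Python project and the pylint analyzer; the src/ path and the pre-commit hook are just examples, and any similar linter wired into a hook or CI step works the same way.

    # Install a static analyzer (pylint is used purely as an example).
    pip install pylint

    # Run it over the source tree as soon as code is written,
    # before the code review phase begins.
    pylint src/

    # Optionally wire the same check into a Git pre-commit hook so the
    # feedback loop is automatic (assumed hook path: .git/hooks/pre-commit).
    echo 'pylint src/ || exit 1' > .git/hooks/pre-commit
    chmod +x .git/hooks/pre-commit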

3. Follow Code Review Best Practices

Manual code reviews are still important for verifying the intent of the code. When code reviews are done well, they improve overall software quality.

4. Refactor Legacy Code (When Necessary)

One way to improve the quality of an existing codebase is through refactoring. Refactoring legacy code can help you clean up your codebase and lower its complexity.

Identify and compare some available tools to maintain the code quality

1) Collaborator

Collaborator is the most comprehensive peer code review tool, built for teams working on projects where code quality is critical.

  • See code changes, identify defects, and make comments on specific lines. Set review rules and automatic notifications to ensure that reviews are completed on time.
  • Custom review templates are unique to Collaborator. Set custom fields, checklists, and participant groups to tailor peer reviews to your team’s ideal workflow.
  • Easily integrate with 11 different SCMs, as well as IDEs like Eclipse & Visual Studio
  • Build custom review reports to drive process improvement and make auditing easy.
  • Conduct peer document reviews in the same tool so that teams can easily align on requirements, design changes, and compliance burdens.

2) Review Assistant

Review Assistant is a code review tool. This code review plug-in helps you to create review requests and respond to them without leaving Visual Studio. Review Assistant supports TFS, Subversion, Git, Mercurial, and Perforce. Simple setup: up and running in 5 minutes.
Key features:
  • Flexible code reviews
  • Discussions in code
  • Iterative review with defect fixing
  • Team Foundation Server integration
  • Flexible email notifications
  • Rich integration features
  • Reporting and Statistics
  • Drop-in Replacement for Visual Studio Code Review Feature and much more

3) Codebrag

  • Codebrag is a simple, light-weight, free and open source code review tool which makes the review entertaining and structured.
  • Codebrag provides features such as non-blocking code review, inline comments and likes, and smart email notifications.
  • With Codebrag one can focus on workflow to find out and eliminate issues along with joint learning and teamwork.
  • Codebrag helps in delivering enhanced software using its agile code review.
  • Codebrag is open source and licensed under the AGPL.

4) Gerrit

  • Gerrit is a free web-based code review tool used by the software developers to review their code on a web-browser and reject or approve the changes.
  • Gerrit can be integrated with Git which is a distributed Version Control System.
  • Gerrit provides the repository management for Git.
  • Using Gerrit, project members can follow a streamlined code review process along with an extremely configurable permission hierarchy.
  • Gerrit is also used for discussing detailed segments of the code and ensuring the right changes are made.

5) Codestriker

  • Codestriker is an open source and free online code reviewing web application that assists with collaborative code review.
  • Using Codestriker one can record the issues, comments, and decisions in a database which can be further used for code inspections.
  • Codestriker supports traditional documents review. It can be integrated with ClearCase, Bugzilla, CVS etc.
  • Codestriker is licensed under GPL.

Need for dependency/package management tools in software development

A package manager or package management system is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer's operating system in a consistent manner.
A package manager deals with packages: distributions of software and data in archive files. Packages contain metadata, such as the software's name, a description of its purpose, a version number, the vendor, a checksum, and a list of dependencies necessary for the software to run properly. Upon installation, metadata is stored in a local package database. Package managers typically maintain a database of software dependencies and version information to prevent software mismatches and missing prerequisites. They work closely with software repositories, binary repository managers, and app stores.
Package managers are designed to eliminate the need for manual installs and updates. This can be particularly useful for large enterprises whose operating systems are based on Linux and other Unix-like systems, typically consisting of hundreds or even tens of thousands of distinct software packages.
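
To make the idea concrete, here is a short hedged sketch of a typical package-manager session on a Debian-style Linux system; apt is used only as a familiar example (yum, dnf, pacman and others behave similarly), and curl is just a placeholder package.

    # Refresh the package metadata from the configured repositories.
    sudo apt-get update

    # Install a package; the manager resolves and installs its dependencies
    # automatically and records everything in the local package database.
    sudo apt-get install curl

    # Upgrade everything that has a newer version available.
    sudo apt-get upgrade

    # Remove a package again in a consistent way.
    sudo apt-get remove curl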

Role of dependency/package management tools in software development

Dependency management tools move the responsibility of managing third-party libraries from the code repository to the automated build. Typically dependency management tools use a single file to declare all library dependencies, making it much easier to see all libraries and their versions at once.

What is a build tool?

Build tools are programs that automate the creation of executable applications from source code. Building incorporates compiling, linking and packaging the code into a usable or executable form. Using an automation tool makes the build process more consistent.

Explain the role of build automation

Build automation is the process of automating the creation of a software build and the associated processes, including: compiling computer source code into binary code, packaging binary code, and running automated tests.
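
To make the definition concrete, here is a hedged sketch of the manual steps a build tool would automate for a small, hypothetical Java project; the src/ layout, the Main class and the test runner class are all placeholders. A build tool performs this whole sequence with a single command, in a consistent and repeatable way.

    # Compile the source code (assumed to live under src/) into class files.
    javac -d out $(find src -name '*.java')

    # Link/package the compiled classes into an executable JAR,
    # with com.example.Main as the (placeholder) entry point.
    jar cfe myapp.jar com.example.Main -C out .

    # Run the automated tests (com.example.AllTestsRunner is a placeholder
    # test runner class for this sketch).
    java -cp out:myapp.jar com.example.AllTestsRunner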

Different build tools used in industry

Invoke

Invoke is a Python (2.6+ and 3.3+) task execution tool & library, drawing inspiration from various sources to arrive at a powerful & clean feature set. 

Open Build Service

The Open Build Service (OBS) is a generic system to build and distribute packages from sources in an automatic, consistent and reproducible way.

Webpack

Webpack is a module bundler for modern JavaScript applications.

Buildr

Buildr is an open-source build system mainly intended to build Java applications, but capable of doing much more. 

Maven

Maven is a build automation tool used primarily for Java projects. The word maven means 'accumulator of knowledge' in Yiddish.

MSBuild

MSBuild, also called Microsoft Build Engine, is a build tool for managed code and was part of .NET Framework.

Explain the build life cycle, using an example

Maven is based around the central concept of a build lifecycle. What this means is that the process for building and distributing a particular artifact (project) is clearly defined.
For the person building a project, this means that it is only necessary to learn a small set of commands to build any Maven project, and the POM will ensure they get the results they desired.
There are three built-in build lifecycles: default, clean and site. The default lifecycle handles your project deployment, the clean lifecycle handles project cleaning, while the site lifecycle handles the creation of your project's site documentation.
Each of these build lifecycles is defined by a different list of build phases, wherein a build phase represents a stage in the lifecycle.
  • validate - validate the project is correct and all necessary information is available
  • compile - compile the source code of the project
  • test - test the compiled source code using a suitable unit testing framework. These tests should not require the code be packaged or deployed
  • package - take the compiled code and package it in its distributable format, such as a JAR.
  • verify - run any checks on results of integration tests to ensure quality criteria are met
  • install - install the package into the local repository, for use as a dependency in other projects locally
  • deploy - done in the build environment, copies the final package to the remote repository for sharing with other developers and projects.
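
Because the phases run in order, invoking one phase from the command line runs every earlier phase of that lifecycle as well. A minimal sketch of how this looks in practice:

    # Runs validate, compile and test, then builds the JAR/WAR.
    mvn package

    # Runs everything up to and including install: the artifact ends up
    # in the local repository (~/.m2/repository by default).
    mvn install

    # Runs the full default lifecycle and copies the final package
    # to the remote repository configured for the project.
    mvn deploy

    # The clean lifecycle is separate and is often combined with a build.
    mvn clean package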

What is Maven, a dependency/package management tool or a build tool or something more?

Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.

How Maven uses conventions over configurations

Maven uses convention over configuration, which means developers are not required to create the build process themselves.
Developers do not have to specify each and every configuration detail; Maven provides sensible default behavior for projects. When a Maven project is created, Maven creates a default project structure. The developer is only required to place files accordingly and does not need to define any configuration in pom.xml.
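
As a sketch of what those conventions look like: provided files are placed in the standard locations below, a near-empty pom.xml containing only the project coordinates is enough to build the project.

    # Standard Maven directory layout (assumed project name my-app):
    #   my-app/
    #     pom.xml               - only the project coordinates are required
    #     src/main/java/        - application source code
    #     src/main/resources/   - application resources (config files, etc.)
    #     src/test/java/        - unit test source code
    #     src/test/resources/   - test resources
    #
    # No build paths are configured anywhere; Maven's conventions locate
    # the sources and tests automatically.
    cd my-app && mvn package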

Discuss the terms build phases, build life cycle, build profile, and build goal in Maven

A Build Lifecycle is a well-defined sequence of phases, which define the order in which the goals are to be executed. Here phase represents a stage in life cycle. As an example, a typical Maven Build Lifecycle consists of the following sequence of phases.
The main phases, what each handles, and a short description:
  • prepare-resources (resource copying) - Resource copying can be customized in this phase.
  • validate (validating the information) - Validates that the project is correct and all necessary information is available.
  • compile (compilation) - Source code compilation is done in this phase.
  • test (testing) - Tests the compiled source code using a suitable testing framework.
  • package (packaging) - This phase creates the JAR/WAR package as specified in the packaging element of pom.xml.
  • install (installation) - This phase installs the package into the local/remote Maven repository.
  • deploy (deploying) - Copies the final package to the remote repository.
There are always pre and post phases to register goals, which must run prior to, or after a particular phase.
When Maven starts building a project, it steps through a defined sequence of phases and executes goals, which are registered with each phase.
Maven has the following three standard lifecycles −
  • clean
  • default (or build)
  • site
A goal represents a specific task which contributes to the building and managing of a project. It may be bound to zero or more build phases. A goal not bound to any build phase can be executed outside of the build lifecycle by direct invocation.
A build profile is a set of configuration values which can be used to set or override default values of the Maven build. Using a build profile, you can customize the build for different environments, such as production versus development.
Profiles are specified in the pom.xml file using its activeProfiles/profiles elements and are triggered in a variety of ways. Profiles modify the POM at build time and are used to give different parameters to different target environments (for example, the path of the database server in the development, testing, and production environments).
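
A short sketch of how phases, goals and profiles look on the command line; the "production" profile name is only an example of something that would be declared in pom.xml.

    # Run a lifecycle phase: every goal bound to this phase and the
    # earlier phases of the lifecycle runs too.
    mvn package

    # Invoke a goal directly, outside the build lifecycle,
    # using the plugin:goal syntax.
    mvn dependency:tree

    # Activate a build profile (assumed to be declared in pom.xml with
    # <id>production</id>) to override the default configuration.
    mvn package -P production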

How Maven manages dependency/packages and build life cycle

Use of the repositories
• Maven uses two types of repositories: local and remote
• Local – inside the developer's computer
• Remote
  • Internal – within the company
  • External – via the internet, from the original repository

Best Practice - Using a Repository Manager
• A repository manager is a dedicated server application designed to manage repositories of binary components. The usage of a repository manager is considered an essential best practice for any significant usage of Maven.
Use the POM
• The Project Object Model (POM) is an XML representation of a Maven project, held in a file named pom.xml
• It contains the configuration of the project, including:
  • the developers involved and the roles they play
  • the defect tracking system
  • the organization and licenses
  • the URL of where the project lives
  • the project's dependencies
  • etc.
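
A hedged sketch of inspecting how Maven combines the POM, the local repository and the remote repositories on an existing project:

    # Show the fully resolved POM after Maven applies its defaults
    # and merges any parent POMs.
    mvn help:effective-pom

    # Download the declared dependencies into the local repository
    # (by default under ~/.m2/repository) from the remote repositories.
    mvn dependency:resolve

    # List the remote repositories the build is actually using
    # (e.g. Maven Central plus any internal repository manager).
    mvn dependency:list-repositories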

Discuss some other contemporary tools and practices widely used in the software industry

1. NPM

Each project can use a package.json file set up through NPM and even managed with Gulp (on Node). Dependencies can be updated and optimized right from the terminal. And you can build new projects with dependency files and version numbers automatically pulled from the package.json file.
NPM is valuable for more than just dependency management, and it's practically a must-know tool for modern web development.
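
A minimal sketch of the NPM workflow described above; the package names (lodash, gulp) are only examples.

    # Create a package.json for the project.
    npm init -y

    # Install a dependency and record it in package.json.
    npm install lodash --save

    # Install a development-only dependency (e.g. a build tool like Gulp).
    npm install gulp --save-dev

    # Later, on another machine, pull in everything package.json declares.
    npm install

    # Check for and apply newer versions allowed by the declared ranges.
    npm outdated
    npm update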

2. Bower

The package management system Bower runs on NPM, which seems a little redundant, but there is a difference between the two: notably, NPM offers more features, while Bower aims for a reduction in file size and load times for frontend dependencies.

3. RubyGems

RubyGems is a package manager for Ruby with a high popularity among web developers. The project is open source and inclusive of all free Ruby gems.
To give a brief overview for beginners, a “gem” is just some code that runs on a Ruby environment. This can lead to programs like Bundler which manage gem versions and keep everything updated.

4. RequireJS

There’s something special about RequireJS in that it’s primarily a JS toolset. It can be used for loading JS modules quickly including Node modules.
RequireJS can automatically detect required dependencies based on what you’re using so this might be akin to classic software programming in C/C++ where libraries are included with further libraries.

5. Jam

Browser-based package management comes in a new form with JamJS. This is a JavaScript package manager with automatic management similar to RequireJS.
All your dependencies are pulled into a single JS file which lets you add and remove items quickly. Plus these can be updated in the browser regardless of other tools you're using (like RequireJS).

6. Browserify

Most developers know of Browserify even if it’s not part of their typical workflow. This is another dependency management tool which optimizes required modules and libraries by bundling them together.
These bundles are supported in the browser which means you can include and merge modules with plain JavaScript. All you need is NPM to get started and then Browserify to get moving.
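
A small sketch of the Browserify flow; main.js and bundle.js are placeholder file names.

    # Install Browserify through NPM.
    npm install -g browserify

    # Bundle main.js and everything it require()s into one file
    # that can be loaded in the browser with a plain <script> tag.
    browserify main.js -o bundle.js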

7. Mantri

Still in its early stages of growth, MantriJS is a dependency system for mid-to-high level web applications. Dependencies are managed through namespaces and organized functionally to avoid collisions and reduce clutter.

8. Volo

The project management tool volo is an open source NPM repo that can create projects, add libraries, and automate workflows.
Volo runs inside Node and relies on JavaScript for project management. A brief intro guide can be found on GitHub explaining the installation process and common usage. For example if you run the command volo create you can affix any library like HTML5 Boilerplate.

9. Ender

Ender is the “no-library library” and is one of the lightest package managers you’ll find online. It allows devs to search through JS packages and install/compile them right from the command line. Ender is thought of as “NPM’s little sister” by the dev team.

10. pip

The recommended method for installing Python dependencies is through pip. This tool was created by the Python Packaging Authority and it’s completely open source just like Python itself.
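
A small sketch of the usual pip workflow; requests is only an example package.

    # Install a package and its dependencies from PyPI.
    pip install requests

    # Record the exact versions currently installed in a single file.
    pip freeze > requirements.txt

    # Recreate the same environment elsewhere from that one file.
    pip install -r requirements.txt

    # Upgrade a package to the latest available version.
    pip install --upgrade requests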


Friday, February 22, 2019

Industry Practices and Tools 1

Need for VCS

Version control systems are a category of software tools that help a software team manage changes to source code over time. Version control software keeps track of every modification to the source in a special kind of database.

  • Collaboration
    * With a VCS, everybody on the team is able to work absolutely freely - on any file at any time.
    * The VCS will later allow you to merge all the changes into a common version.
  • Storing versions properly
    * A version control system acknowledges that there is only one project.
  • Restoring previous versions
  • Understanding what happened
    * Every time you save a new version of your project, your VCS requires you to provide a short description of what was changed.
  • Backup

Differentiate the three models of VCSs, stating their pros and cons

Local version control systems

  • Oldest VCS
  • Everything is in your Computer
  • Cannot be used for collaborative software development

Centralized version control systems

  • Can be used for collaborative software development
  • Everyone knows to a certain degree what others on the project are doing
  • Administrators have fine grained control over who can do what
  • The most obvious downside is the single point of failure that the centralized server represents

Distributed version control systems

  • No single point of failure
  • Clients don't just check out the latest snapshot of the files: they fully mirror the repository
  • If any server dies, and these systems were collaborating via it, any of the client repositories can be copied back
  • Can collaborate with different groups of people in different ways simultaneously within the same project

Git and GitHub, are they same or different? Discuss with facts

Git is a revision control system, a tool to manage your source code history. GitHub is a hosting service for Git repositories. So they are not the same thing: Git is the tool, GitHub is the service for projects that use Git.


Compare and contrast the Git commands, commit and push

Since Git is a distributed version control system, the difference is that commit will commit changes to your local repository, whereas push will push changes up to a remote repo. git commit records your changes in the local repository, while git push updates the remote repository with your local changes.
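
A minimal sketch of the difference; origin and master are just the conventional remote and branch names, and app.py is a placeholder file.

    # Record a snapshot in the local repository only.
    git add app.py
    git commit -m "Fix input validation"

    # Nothing has left the machine yet; publish the local commits
    # to the remote repository so others can see them.
    git push origin master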

Discuss the use of staging area and Git directory


Git has three main states that your files can reside in: committed, modified, and staged. Committed means that the data is safely stored in your local database. Modified means that you have changed the file but have not committed it to your database yet. Staged means that you have marked a modified file in its current version to go into your next commit snapshot.

This leads us to the three main sections of a Git project: the Git directory, the working directory, and the staging area.
The Git directory is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer.
The working directory is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.
The staging area is a simple file, generally contained in your Git directory, that stores information about what will go into your next commit. It’s sometimes referred to as the index, but it’s becoming standard to refer to it as the staging area.
The basic Git workflow goes something like this:
  1. You modify files in your working directory.
  2. You stage the files, adding snapshots of them to your staging area.
  3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
If a particular version of a file is in the Git directory, it's considered committed. If it's modified but has been added to the staging area, it is staged. And if it was changed since it was checked out but has not been staged, it is modified.
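
A short sketch of the three states in practice; hello.py is a placeholder file.

    # Modify a file in the working directory.
    echo "print('hello')" >> hello.py     # the file is now "modified"

    # Stage it: a snapshot of its current content goes into the staging area.
    git add hello.py                      # the file is now "staged"

    # Commit: the staged snapshot is stored permanently in the Git directory.
    git commit -m "Add greeting"          # the file is now "committed"

    # git status reports which of the three states each file is in.
    git status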

Explain the collaboration workflow of Git, with example


The central repository represents the official project, so its commit history should be treated as sacred and immutable. If a developer’s local commits diverge from the central repository, Git will refuse to push their changes because this would overwrite official commits.
Before the developer can publish their feature, they need to fetch the updated central commits and rebase their changes on top of them. This is like saying, “I want to add my changes to what everyone else has already done.” The result is a perfectly linear history, just like in traditional SVN workflows.
If local changes directly conflict with upstream commits, Git will pause the rebasing process and give you a chance to manually resolve the conflicts. The nice thing about Git is that it uses the same git status and git add commands for both generating commits and resolving merge conflicts. This makes it easy for new developers to manage their own merges. Plus, if they get themselves into trouble, Git makes it very easy to abort the entire rebase and try again (or go find help).
Example
Let's look at a general example of how a typical small team would collaborate using this workflow. We'll see how two developers, John and Mary, can work on separate features and share their contributions via a centralized repository.
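
A hedged sketch of the centralized-workflow steps described above, from one developer's point of view; origin/master stands for the central repository's main branch, and feature.py is a placeholder file.

    # Work locally and commit as usual.
    git add feature.py
    git commit -m "Add report export"

    # Before publishing, fetch the central commits and rebase on top of them.
    git fetch origin
    git rebase origin/master

    # If a conflict pauses the rebase, fix the files, then:
    git add feature.py
    git rebase --continue      # or: git rebase --abort to start over

    # Publish the now-linear history to the central repository.
    git push origin master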

Discuss the benefits of CDNs


• Improving website load times - By distributing content closer to website visitors using a nearby CDN server (among other optimizations), visitors experience faster page loading times. As visitors are more inclined to click away from a slow-loading site, a CDN can reduce bounce rates and increase the amount of time that people spend on the site. In other words, a faster website means more visitors will stay and stick around longer.

• Reducing bandwidth costs - Bandwidth consumption costs for website hosting are a primary expense for websites. Through caching and other optimizations, CDNs are able to reduce the amount of data an origin server must provide, thus reducing hosting costs for website owners.

• Increasing content availability and redundancy - Large amounts of traffic or hardware failures can interrupt normal website function. Thanks to their distributed nature, a CDN can handle more traffic and withstand hardware failure better than many origin servers.

• Improving website security - A CDN may improve security by providing DDoS mitigation, improvements to security certificates, and other optimizations.

Differences Between CDNs and Web Hosting

  1. Web Hosting is used to host your website on a server and let users access it over the internet. A content delivery network is about speeding up the access/delivery of your website’s assets to those users.
  2. Traditional web hosting would deliver 100% of your content to the user. If they are located across the world, the user still must wait for the data to be retrieved from where your web server is located. A CDN takes a majority of your static and dynamic content and serves it from across the globe, decreasing download times. Most times, the closer the CDN server is to the web visitor, the faster assets will load for them.
  3. Web Hosting normally refers to one server. A content delivery network refers to a global network of edge servers which distributes your content from a multi-host environment.

Identify free and commercial CDNs

Free:
  1. CloudFlare
  2. Incapsula
  3. Photon by Jetpack
  4. Swarmify
Commercial:
  1. Google Cloud CDN
  2. AWS CloudFront
  3. Cloudinary
  4. Imgur
  5. Microsoft Azure CDN

Discuss the requirements for virtualization

1. Hardware virtualization
• VMs, emulators

2. OS level virtualization (Desktop virtualization)
• Remote desktop terminals

3. Application level virtualization
• Runtimes (JRE/JVM, .NET), engines (game engines)

4. Containerization (also OS/application level)
• Docker

5. Other virtualization types
• Database, network, storage, etc.

Discuss and compare the pros and cons of different virtualization techniques in different levels

Pros-
  • Using Virtualization for Efficient Hardware Utilization
  • Using Virtualization to Increase Availability
  • Disaster Recovery
  • Save Energy
  • Deploying servers quickly
  • Save Space in your Server Room or Datacenter
  • Testing and setting up Lab Environment
  • Shifting all your Local Infrastructure to Cloud in a day
  • Possibility to Divide Services
Cons-
  • Extra Costs
  • Software Licensing
  • Learn the new Infrastructure

Identify popular implementations and available tools for each level of virtualization

Tools

1. Virtual Network User Mode Linux (VNUML)
VNUML (Barham et al. 2003) is open source and available to all users as a free download. VNUML is basically a virtualization tool used to run multiple virtual Linux systems. These virtual systems are known as guests; they run their applications alongside the Linux operating system of the original system, which is referred to as the host.

2. Virtual Box
VirtualBox is used to implement virtual machines on physical computers and servers. It also performs full virtualization on the host computer, which means that the guest operating system is executed on the host computer without any modification to the operating system (Geiselhart et al. 2003).

3. VMware Server
It is a free virtualization tool for Linux as well as Windows operating systems (Cox 2007). VMware Server is based on full virtualization, i.e., it allows the physical desktop computer to run more than one virtual machine, each with a different guest operating system, on top of it.

4. Qemu
Qemu is used to implement virtualization on operating systems such as Linux and Windows. It is a popular open-source emulator (R. & M. 2007) that provides fast emulation with the help of dynamic translation. It has many useful commands for managing VMs.

5. Xen
Xen is also an open-source virtualization tool, widely used for paravirtualization on the host PC and guest computers (Bavier et al. 2006).

6. VMware

VMware is a VM (virtual machine) platform that allows an unmodified operating system to run on the host as a user-level application. An operating system being executed with VMware may be crashed, reinstalled or rebooted without any effect on the applications running on the host computer.
VMware separates the guest operating system from the real host operating system, so that if the guest operating system fails, the physical hardware and the host machine do not suffer the consequences (Fuertes & de Vergara 2007).
VMware is used to produce an illusion of standard personal computer hardware inside the virtual machine. VMware can therefore execute several unmodified operating systems at the same time on a single hardware machine, by running each operating system in its own virtual machine. Instead of running code indirectly on the hardware as a software simulator does, the virtual machine executes the code directly on the physical hardware, without any application interpreting the code.

7. EMF Tool
The EMF virtualization tool is an Eclipse-based plug-in, built on EMF, that supports the transparent use of virtual models, all of which are based on EMF. To create a virtual model using the EMF tool, users have to provide the contributing models along with their metamodels for the virtualization.


What is the hypervisor and what is the role of it?

A hypervisor or virtual machine monitor (VMM) is computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called a host machine, and each virtual machine is called a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Multiple instances of a variety of operating systems may share the virtualized hardware resources: for example, Linux, Windows, and macOS instances can all run on a single physical x86 machine. This contrasts with operating-system-level virtualization, where all instances (usually called containers) must share a single kernel, though the guest operating systems can differ in user space, such as different Linux distributions with the same kernel.


How is emulation different from VMs?

The purpose of a virtual machine is to create an isolated environment.
The purpose of an emulator is to accurately reproduce the behavior of some hardware.
Both aim for some level of independence from the hardware of the host machine, but a virtual machine tends to simulate just enough hardware to make the guest work, and do so with an emphasis on efficiency of the emulation/virtualization. Ultimately the virtual machine may not act like any hardware that really exists, and may need VM-specific drivers, but the set of guest drivers will be consistent across a large number of virtual environments.
An emulator on the other hand tries to exactly reproduce all the behavior, including quirks and bugs, of some real hardware being simulated. Required guest drivers will exactly match the environment being simulated.
Virtualization, paravirtualization, and emulation technology, or some combination may be used for the implementation of virtual machines. Emulators generally can't use virtualization, because that would make the abstraction somewhat leaky.

Compare and contrast the VMs and containers/dockers, indicating their advantages and disadvantages

A virtual machine (VM) is an emulation of a computer system. Put simply, it makes it possible to run what appear to be many separate computers on hardware that is actually one computer.

With containers, instead of virtualizing the underlying computer like a virtual machine (VM), just the OS is virtualized.

Term “Rich Internet Applications” (RIAs) from “Rich Web-based Applications” (RiWAs). A rich Internet application (RIA) is a Web applicati...