Monday, December 22, 2014

End of the Year Holidays

The MOSEK staff will take a break during the Christmas and New Year holidays.

Customer support will be reduced during the period 22/12/2014 to 04/01/2015 and suspended on December 24, 25, 26, 2014 and January 1st, 2015.



We wish you a merry Christmas and a Happy New Year!



Tuesday, December 16, 2014

MOSEK 7.1 released!

We are happy to announce that MOSEK version 7.1 has been released!


After several weeks as a beta version, MOSEK 7.1 has now been promoted to the new stable version we provide to MOSEK users. Some of the new features are:

  • Updated mixed-integer conic optimizer with improved performance.
  • Updated MPS reader with support for the CPLEX format
  • The default MPS format has been changed to the free MPS format.
  • Experimental support for reading the conic benchmark format. Semidefinite constraints and variables are not supported yet.
  • A new experimental feature for reformulating certain quadratically constrained problems into conic problems.
  • In Fusion it is now possible to obtain optimizer statistics e.g. mixed integer gap etc.
  • Added some of the BLAS and LAPACK functionality e.g. gemm and potrf functions.

Grab your copy from our website


Licenses for MOSEK 7 are still valid. And remember, you can easily obtain an academic or commercial trial license with a few clicks!


Wednesday, November 19, 2014

The reason for the 2 in the definition of the rotated quadratic cone

MOSEK employs the definition
\begin{equation}
K^1 := \{ x \mid 2 x_1 x_2 \geq ||x_{3:n}||^2,  x_1, x_2 \geq 0 \}
\end{equation}
for the rotated quadratic cone. Occasionally users of MOSEK ask why there is a 2 in front of the product $x_1 x_2$.  Why not use the definition
\begin{equation}
K^2 := \{ x \mid  x_1 x_2 \geq ||x_{3:n}||^2,  x_1, x_2 \geq 0 \} ?
\end{equation}

The reason is that the dual cone plays an important role, and the dual cone of $K^1$ is $K^1$ itself, i.e. it is self-dual. That is pretty! The dual cone of $K^2$, on the other hand, is
\begin{equation}
\{ s \mid 4 s_1 s_2 \geq ||s_{3:n}||^2,  s_1, s_2 \geq 0 \}.
\end{equation}
Hence, $K^2$ is not self-dual! That is somewhat ugly and inconvenient. 
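A quick way to see where the factor $4$ comes from: $K^2$ is the image of $K^1$ under the scaling $D=\mathrm{diag}(\sqrt{2},\sqrt{2},1,\ldots,1)$, and for an invertible $D$ the dual of a scaled cone satisfies $(DK)^* = D^{-T}K^*$. Since $D$ is symmetric and $K^1$ is self-dual, we get
\begin{equation}
(K^2)^* = (D K^1)^* = D^{-1} K^1 = \{ s \mid 2 (\sqrt{2} s_1)(\sqrt{2} s_2) \geq ||s_{3:n}||^2,\ s_1, s_2 \geq 0 \},
\end{equation}
which is exactly the cone with the factor $4$ above.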

To summarize, the definition $K^1$ of the rotated quadratic cone is preferred because the alternative definition $K^2$ is not self-dual.

A couple of historical notes:

  • In the classical paper by Lobo et al., the cone $K^2$ is called a hyperbolic constraint in Section 2.3.  
  • MOSEK is highly inspired by the important work of the late Jos Sturm on the code SeDuMi. SeDuMi is short for self-dual minimization, and for that reason Sturm employs the definition $K^1$ too. 



Tuesday, October 28, 2014

MOSEK version 7.1 BETA available!

We are glad to announce that a beta version of the upcoming release 7.1 of our software is now available for download.

The main improvements and new features are:

* Improved performance of the mixed-integer conic optimizer

* Updated MPS reader with support for the CPLEX specific sections QMATRIX and QCMATRIX.

* The default MPS format has been changed to the free MPS format.

* Experimental support for reading the conic benchmark format. Semidefinite constraints and
variables are not supported yet.

* A new experimental feature for reformulating certain quadratically constrained problems into conic problems.

* In Fusion it is now possible to obtain optimizer statistics e.g. mixed integer gap etc.

* Access to some BLAS/LAPACK routines directly from the solver API.


To obtain a copy please visit


and select the relevant platform and operating system. We hope to hear your feedback!

Note that the beta version needs a valid license for MOSEK 7, including trial and academic licenses.


The MOSEK Team

Monday, October 20, 2014

2015 ISMP Symposium sponsorship

We are glad to announce that MOSEK will sponsor the ISMP 2015 symposium, which will take place in Pittsburgh, PA, USA, on July 12-17, 2015.

We believe it will be an exciting event and we look forward to joining the meeting!


Tuesday, October 7, 2014

Slides from our workshop on SDP preprocessing.

We would like to thank our guests for the nice workshop held yesterday here at our site. Etienne, Frank and Joachim gave three nice and interesting talks on cutting-edge techniques for the preprocessing of semidefinite programs. 

Click on the links below to download the slides:

Title: Exploiting special structure in semidefinite programs
Speaker: Etienne De Klerk, Tilburg University (NL)


Title: Partial facial reduction: simplified, equivalent SDPs via approximations of the PSD cone
Speaker: Frank Permenter, MIT (US)

Title: Solving the pooling problem using semidefinite programming.
Speaker: Joachim Dahl, MOSEK ApS (DK)



We hope to see all of you soon, possibly at the INFORMS meeting in November!

Thursday, September 4, 2014

Seminar on "Preprocessing semidefinite optimization problems"

We are pleased to announce that on October 6th, 2014 a series of seminars titled "Preprocessing semidefinite optimization problems" will be held at the MOSEK site in the Symbion research park.

Three prominent experts in the field, namely E. De Klerk, F. Permenter and J. Dahl, will discuss recent advances and applications of semidefinite optimization problem preprocessing.

Participation is free and everybody is welcome, but for logistic reasons we would like you to register here. Refreshments will be served. Detailed titles and abstracts can be found at the end of the page.

Schedule of the day:

15:00 - 15:05 Introduction
15:05 - 15:50 E. De Klerk,
 Tilburg University (NL)
15:50 - 15:55 Break
15:55 - 16:40 F. Permenter, MIT (US)
16:40 - 16:45 Break
16:45 - 17:15 J. Dahl, MOSEK ApS (DK)
18:00 Optional networking at Nørrebro Bryghus.


We hope to see you!

Contact: info@mosek.com
How to get there: check information on the Symbion website

How to register: fill the form below or click here






-----------------------------------------------------------------------------------------



Title: Exploiting special structure in semidefinite programs
Speaker: Etienne De Klerk, Tilburg University (NL)

Abstract: Semidefinite Programming (SDP) may be seen as a generalization of Linear Programming (LP). In particular, one may extend interior point algorithms for LP to SDP, but it has proven much more difficult to exploit structure in the SDP data during computation. We survey three types of special structure in SDP data:
  • a common `chordal' sparsity pattern of all the data matrices. This structure arises in applications in graph theory, and may also be used to deal with more general sparsity patterns in a heuristic way.
  • low rank of all the data matrices. This structure is common in SDP relaxations of combinatorial optimization problems, and SDP approximations of polynomial optimization problems.
  • the situation where the data matrices are invariant under the action of a permutation group, or, more generally, where the data matrices belong to a low dimensional matrix algebra. Such problems arise in truss topology optimization, particle physics, coding theory, computational geometry, and graph theory.


Title: Partial facial reduction: simplified, equivalent SDPs via approximations of the PSD cone
Speaker: Frank Permenter, MIT (US)

Abstract: We develop a practical semidefinite programming (SDP) facial reduction procedure that utilizes computationally efficient approximations of the positive semidefinite cone. The proposed method simplifies SDPs with no strictly feasible solution
by solving a sequence of easier optimization problems and could be a useful pre-processing step for SDP solvers. We demonstrate effectiveness of the method on SDPs arising in practice, and describe our publicly-available software implementation.
We also give a post-processing procedure for dual solution recovery that generally applies to facial-reduction-based pre-processing techniques. Joint work with Pablo Parrilo.

Title: Solving the pooling problem using semidefinite programming.
Speaker: Joachim Dahl, MOSEK ApS (DK)

Abstract: The pooling problem is a well-studied difficult nonlinear flow-problem with important applications in the oil- and refinery industry. In this talk we review how to solve this difficult non-convex problem using the Lasserre hierarchy of semidefinite relaxations, and we demonstrate how chordal structure can be exploited to solve larger instances.

Tuesday, August 5, 2014

License server hosted by Amazon Web Service

Hosting services in the cloud is more and more popular. As prices fall and processes simplify, a growing number of companies consider moving some services to dedicated machines provided by cloud services such as Amazon Web Services (AWS).

Some of our customers have recently considered moving the license server used to manage MOSEK licenses to a virtual machine provided by AWS. We think it may be an interesting choice for other customers too. Here are the details.

Background

A license server is a service that is used to provide access to a pool of licenses. When a user runs a MOSEK instance the solver asks the license server for a license token: if there is one available, the token is assigned to the requester. MOSEK relies on the FLEXlm license manager, a de-facto standard.

A license server is a lightweight process that can run on any machine that is accessible by the users. Yet it may require
  • a fast and reliable network connection;
  • a reliable machine to minimize down-time.
High-availability can be obtained by using a three-server redundant configuration. Otherwise, a  Disaster Recovery (DR) license can be provided for a backup machine.

Licenses are assigned to a specific machine, identified by a hostid, i.e. a unique code that the license manager generates for that machine (typically the MAC address).

Why

Moving a license server to the cloud can be an interesting choice:

Access a common pool of licenses from different sites.
In this case a license server accessible from the Internet must be set up. Security and availability issues must be carefully considered, while network latency is a minor issue.

Hosting a license server in the cloud solves most of the issues: the server is available from anywhere on the Internet; security is managed by the provider; and reliable machines ensure limited or no downtime.

Hosting a DR license

Hopefully, a DR license server is only needed in exceptional cases. Nevertheless it must be kept running and fully working. This involves periodically testing the DR server to ensure its full availability.

A small virtual machine hosting a DR license server can be an alternative: it can be switched on when needed, with no worries about its availability. Often you also pay only for the up-time.

Simplify IT management

No more machine decommissioning: the cloud provider will take care of that and, as we will see, your license server will run independently of any physical machine. The server takes no space, never has to be moved or looked after, needs no special room, and will not suffer from any local network or site re-organization.

Why NOT

Moving to the cloud is not always the solution:

  1. If speed is a concern, then the network latency may be unacceptable.
  2. It doesn't solve connection issues from a site to the license server.
  3. Lack of control over the virtual machine.
  4. Three-server redundancy would require three virtual machines.
  5. Yet another service provider to deal with.
  6. Running a license server in-house is usually much cheaper. 

How

We will briefly describe how to set up a license server on an AMI instance from AWS. Similar steps are usually required for other cloud services.

The main issue is to obtain a hostid for an AMI instance that will not change after every machine reboot. Otherwise you will need a new license file each time.

The FLEXlm license server can use as hostid the Elastic IP (EIP) associated with each AMI instance: an EIP can be set manually by the user, so that if the machine needs a restart, the EIP can be reset to a given value and the license server remains correctly assigned to the machine. For more information about the use of the FLEXlm license manager on cloud machines, please refer to chapter 16 of the license manager manual.

So in a nutshell, deploying a license server on an AMI instance involves these few steps:

  1. Follow the official guide to obtain the EIP.
  2. Send us the EIP so we can issue a license file tied to that hostid.
  3. Install MOSEK on the AMI instance along with the license file.
  4. Set up the clients so that they will look for that host to fetch the license (see here, and the sketch below).
  5. If needed, adjust firewall permissions and listening ports.
That's it!
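As an illustration, a client can be pointed at the server through the standard MOSEK/FLEXlm environment variable. In the sketch below the hostname is hypothetical and 27007 is assumed to be the port used by the license server (check your license file if it differs):

import os

# Hypothetical server name; replace it with the public DNS name or EIP of
# your AMI instance. The port must match the one used by the license server.
os.environ["MOSEKLM_LICENSE_FILE"] = "27007@my-license-server.example.com"

import mosek

env = mosek.Env()   # the variable is read when a license token is checked out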

Any questions? Contact us!


Sunday, July 20, 2014

License server back online

Dear MOSEK users,
we are pleased to announce that the license server that provides academic and trial licenses is now back online and accessible at


  • http://license.mosek.com/license2/trial/ for trial licenses 
  • http://license.mosek.com/license2/academic for academic licenses


Users experiencing any issue connecting to the server are kindly asked to contact our support team.

We apologize for any inconvenience the shutdown may have caused.


Andrea Cassioli
MOSEK Product Manager


Friday, July 18, 2014

Trial and academic licenses temporarily unavailable

Dear MOSEK users,
due to technical reasons, our licensing system that provides trial and academic licenses is currently unavailable.

Users in need of an evaluation license are asked to contact our support service at support@mosek.com specifying:


  • which kind of license they need (trial/academic),
  • relevant contact information,
  • for academic licenses, a valid email address provided by an academic institution.


We do apologize for the inconvenience and we are working hard to get back online as soon as possible.


Sincerely,
Andrea Cassioli
MOSEK Product Manager

Monday, June 30, 2014

The licensing system overhead

INTRODUCTION

MOSEK employs a license system to make sure users pay for the software so that the developers can get paid.  The way the licensing system works is that a valid license token must be obtained either from a file or from a token server, i.e. a service/daemon running on a computer on the network.
Clearly, checking out a license token costs some time, but for large-scale optimization problems the cost is negligible. However, in an application where a large number of small problems are solved in a short amount of time, the checkout overhead can be significant. Here we will try to quantify the overhead.

Licensing system at a glance

First, let's summarize how MOSEK interacts with the license system:
  1. A license is checked out the first time MOSEK tries to optimize a problem.
  2. The check out involves either reading a file or querying a token server on the network.
  3. By default MOSEK employs license caching, i.e. the token is stored in the so-called MOSEK environment. An environment can be reused for all the optimizations.
  4. Therefore, the license is released when the MOSEK environment is destroyed and not when a single optimization terminates.
The license check-out overhead is split between the first optimization and the environment termination. Many users, for the sake of cleaner and simpler code, create and destroy an environment for each optimization. This approach also keeps licenses free as much as possible. But if you were to solve thousands of small problems, you might experience a performance penalty.

The test

So the question arises: how much overhead does the licensing system introduce?

To answer this question we will test MOSEK using the Python Optimizer API. We will call the solver with an empty problem twice, and measure the time spent for the first and the second call. The test is then repeated 10000 times. We use the standard timeit  Python package. The code for the tests is the following:
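The original listing is not reproduced here; the sketch below shows the kind of timing script described above, using the Python Optimizer API (the function names are illustrative):

import timeit
import mosek

def solve_fresh_env():
    # A new environment for every solve: the license token is checked out
    # at optimize() and released when the environment is destroyed.
    env = mosek.Env()
    task = env.Task(0, 0)
    task.optimize()

shared_env = mosek.Env()   # created once; the license token is cached here

def solve_shared_env():
    # Reuse the environment and allocate a new empty task per solve.
    task = shared_env.Task(0, 0)
    task.optimize()

n = 10000
print("fresh environment :", timeit.timeit(solve_fresh_env, number=n) / n)
print("shared environment:", timeit.timeit(solve_shared_env, number=n) / n)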



We execute the test both in the case of a machine running a token server locally, and in the case of a license file stored locally. The latter corresponds to the case of using a trial license for instance. We obtain the following results:



  1. The first line shows the time spent creating the environment and the task and then running the solver. The license is released as the environment is destroyed.
  2. The second line reports the time when the environment is created only once and kept the same, while a new task is allocated for each problem.
  3. The last line shows the case in which both the environment and the task are created only once.

It can be seen that:
  • There is no significant time difference between using a license file and using a token server in an ideal setting like the one we are using.
  • If the environment is not reused then the average time is about $8$ milliseconds, whereas if it is reused the average time is about $0.1$ milliseconds. 
  • It takes around $8$ milliseconds to check out a license.

Summary


Therefore, if an optimization problem requires less than, say, $0.1$ seconds to solve, then the license check-out time is going to be significant if license caching is disabled, i.e. if the MOSEK environment is created and destroyed at each solver run.

Tuesday, June 17, 2014

What if the solver stalls?

In this post we will discuss why the solver sometimes stalls and how this relates to the solution quality. What do we mean by a stall?

The solver is stalled whenever it cannot make any sensible progress towards the optimal solution.

An iteration of a continuous optimizer in MOSEK consists of
\[ x^+ = x + \alpha d_x\]
where
  • $x$:  Is the current guess for the optimal solution.
  • $d_x$: Is a direction computed in some way by the optimizer.
  • $\alpha$: Is a nonnegative step size.
  • $x^+$: Is the updated guess for the optimal solution.
The interpretation is that the optimizer computes a search direction $d_x$ and then takes a step in that direction. The search direction is computed such that when moving in that direction the current solution will be moved closer towards an optimal solution. Normally the step size $\alpha$ is chosen such that the new trial solution $x^+$ is as close as possible to the optimal solution with respect to some merit function. 

If the search direction is badly chosen then the optimal step size might be very close to $0$, in which case the algorithm is stalled. So why does this happen in practice? Typically the search direction $d_x$ is given as the solution of a system of linear equations, say


\[H d_x = h,\]


where it can be proven that if the linear system is solved accurately then good progress is obtained. Unfortunately, in practice, because computations are done in finite precision, the search direction may be of poor quality, leading to a stall.

To summarize: due to numerical issues MOSEK may not be able to compute a good search direction, and if that leads to an extremely short step then MOSEK stops and reports STALLED. It does not make sense to continue, because the subsequent iterations will also lead to small steps and no progress at all.

Observe that when a stall occurs the current solution is usually very close to an optimal solution, and in many cases the reported solution is good enough for most practical purposes. Recall that the optimization problem solved is usually an approximation of the true problem, and even in the best case MOSEK will only report an approximate solution to an approximate problem. 

Let's visualize the algorithm execution using a simplified scheme: the feasible region (primal and dual) is the yellowish one and the solution is the green dot on its corner. The algorithm generates a sequence of iterates, the green dots, that hopefully will end up at point (10).

Interior-point algorithm execution sketch.

For each iterate, a normalized primal and dual feasibility violation is computed, as well as a measure of the optimality violation. Let's call them $r_p,r_d$ and $r_g$ respectively. Ideally, $||(r_p,r_d,r_g)||$ should be driven to zero, but in practice the solver stops when
\[ ||(r_p,r_d,r_g)|| \leq \epsilon_f.\]

Thus $\epsilon_f$ defines a region in which a solution is optimal, i.e. the purple region in the figure. When the solver terminates it provides a solution with a certain solution status. Hopefully, it will be optimal. Otherwise, the solver may still be able to provide a useful solution: for this reason, and to improve flexibility, MOSEK denotes some solutions as near-optimal: the solution would be optimal if we scale up $\epsilon_f$ by a factor $\alpha>1$, i.e.

\[ ||(r_p,r_d,r_g)|| \leq \epsilon_f\alpha.\]

This corresponds to the reddish region. As the algorithm proceeds, we reach the near-optimal region at step (9) and then the optimal one at step (10). If all goes well, we stop in the latter. If we get stalled at (9) we get a near-optimal solution. But what if the solver gets stalled at step (8)? The solution status is unknown in this case.

In general, if the solver is stalled, the solution status is still meaningful since the solver is able to check the solution feasibility and the residual values. In particular, a solution could be near-optimal. In our example, step (8) is not even near-optimal and might not be very useful, but it is still worth checking.
In case of a stall, the user should
  • look at the solution summary: it provides the primal and dual violations under the columns PFEAS and DFEAS, while the MU column measures the optimality violation;
  • retrieve the solution status from the solver, as in the sketch below.
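With the Python Optimizer API this amounts to something like the following sketch (the task is assumed to already hold the problem and to have been optimized):

import sys
import mosek

def inspect_solution(task):
    # Print the solution summary, including the PFEAS, DFEAS and MU columns.
    task.set_Stream(mosek.streamtype.log, sys.stdout.write)
    task.solutionsummary(mosek.streamtype.log)
    # Return the status of the interior-point solution,
    # e.g. mosek.solsta.optimal or mosek.solsta.near_optimal.
    return task.getsolsta(mosek.soltype.itr)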
What can be done to avoid stalls? It is hard to say anything in general. A few good rules to follow are:
  • Avoid nearly feasible or nearly infeasible problems.
  • Make sure the problem is scaled reasonably.
  • Make sure the primal and dual solutions are reasonably bounded i.e. not too large in absolute size.
Users should consider revising their model if stalling happens regularly. If modeling does not help, you are encouraged to contact our support.

What NOT to do when an optimization problem is infeasible!

Given that your optimization problem is infeasible, what should you do?

A common practice is to look at the infeasible solution returned by the optimizer, e.g. MOSEK, and then relax the bounds that are violated by the reported solution. This practice is rooted in the fact that in the good old days all linear optimization problems were solved using the primal simplex algorithm, employing a phase 1 and phase 2 approach as described in textbooks. The purpose of phase 1 is to find a feasible solution, and phase 2 then locates an optimal solution starting from the phase 1 solution. The implication is that if no feasible solution exists, then the reported solution will be as close to a feasible solution as possible in a well-defined sense. Therefore, it makes sense to relax the problem based on the solution reported by phase 1, because that leads to the smallest possible relaxation that makes the problem feasible.

However, nowadays this does not work in most cases for the following reasons:
  1. Most optimizers apply a presolve to the problem before optimizing. Finding an optimal phase 1 solution to the presolved problem is almost never the same as finding an optimal phase 1 solution to the original problem if the problem is infeasible. Therefore, if the presolve has been effective, the reported solution is likely to be far from an optimal phase 1 solution to the original problem.
  2. Moreover, if the problem is found to be infeasible during the presolve, then the phase 1 problem is not solved at all and hence the optimal solution to the phase 1 problem is not available.  
  3. Interior-point based optimizers do not use a phase 1 and phase 2 approach. Therefore, the primal solution returned by an interior-point optimizer for infeasible problems is usually not very informative and definitely not close to an optimal solution to a phase 1 problem.
  4. The dual simplex algorithm never solves the phase 1 problem. Rather, in case of an infeasible problem it produces a Farkas certificate of the infeasibility.  
The main recommendation is: if an optimization problem is diagnosed as infeasible by an optimizer, e.g. MOSEK, then do NOT use the primal solution for anything unless you are absolutely sure it is an optimal solution to a phase 1 problem. 

Then what should you do? Assuming the problem is a continuous linear or convex optimization problem, there is a so-called Farkas certificate that proves the problem is infeasible. Some optimizers, like MOSEK, will report the Farkas certificate as the dual solution. Such a Farkas certificate can be used to repair the problem as discussed in the technical report.  
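For instance, with the Python Optimizer API the certificate can be read off the dual solution roughly as in the following sketch (error handling omitted):

import mosek

def primal_infeasibility_certificate(task):
    # If the problem was declared primal infeasible, the dual values y of
    # the interior-point solution form a Farkas certificate of infeasibility.
    if task.getprosta(mosek.soltype.itr) == mosek.prosta.prim_infeas:
        y = [0.0] * task.getnumcon()
        task.gety(mosek.soltype.itr, y)
        return y
    return None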

Alternatively, a phase 1 problem can be formed explicitly and solved if you want to do the traditional repair procedure. By the way, the optimal dual solution to the phase 1 problem will be a Farkas certificate of the infeasibility status. Hence, the phase 1 problem can be seen as computing a Farkas certificate, if one is available.

Wednesday, June 4, 2014

New Rmosek release available

We are happy to announce that a new version of Rmosek is available.

The new release is now fully compatible with the latest R version 3.1 recently released.

You can download  the package from



Give it a try!

Thursday, May 8, 2014

MATLAB GUI and MOSEK logging: a trick worth knowing!

Are you a MATLAB GUI user? Are you solving a large number of problems? Are you solving small problems? If so, there is a trick that can be handy for speeding up MOSEK!

The usual way to call MOSEK using the MATLAB toolbox is

[r,res] = mosekopt('minimize',prob);

where prob is a structure that holds the problem information. To suppress the solver logging you can set the option MSK_IPAR_LOG to zero. Instead, try to use

[r,res] = mosekopt('minimize echo(0)',prob);

and check the running time: you should see a reduction! This is due to the way MOSEK output is generated in our MATLAB toolbox.

You can add the string "echo(0)" to any command passed on to the solver, i.e.

 [r,res] = mosekopt('any_command echo(0)',prob);


The running time reduction is NOT proportional to the problem size, but roughly to the amount of output. This is why you may gain significantly if you solve:
  1. small problems: because the solver running time might be dominated by the output;
  2. a large number of problems: because the gains add up.
Note that the trick does not apply to the MATLAB interface of the Fusion API.

Wednesday, April 30, 2014

End of Windows XP support


Windows XP reached the end of its life-cycle on April 8th. As a consequence, it will no longer be supported by MOSEK either.

For our users, this means in practice that:

  • MOSEK will no longer be tested on Windows XP.
  • There will be no more bug fixes related to Windows XP.
  • Performance tuning will not be carried out under Windows XP.
  • Compatibility of future releases of MOSEK, both major and minor, with Windows XP will not be guaranteed.
As for all other issues, customers will receive support whatever platform they run MOSEK on. 


For any questions please contact us through support(at)mosek.com

Thursday, April 10, 2014

Easter time at MOSEK

Dear all,
next week people here at MOSEK will celebrate Easter, a national holiday here in Denmark. Thus, not all of us will be at work full-time.

The support and sales services will be provided on a reduced basis:

  • From Monday 14/4 to Wednesday 16/4 the service might be reduced but will remain active.
  • From Thursday 17/4 to Monday 21/4 there will be no support available.
All unresolved requests will be processed as soon as possible from Tuesday 22/4.



We wish all of you a Happy Easter.

Wednesday, April 2, 2014

MOSEK version 7.0.0.111 available!

We are happy to announce that the new MOSEK release 7.0.0.111 is now available to download from our download page.

Major bug fixes:


  • A bug in the sensitivity analysis.
  • A bug in the .NET SCopt interface.
  • An incorrect assert in the basis identification.
  • Objective cuts not working in the simplex optimizers when maximizing. 
  • A bug causing an assert on some dual infeasible problems.

Performance improvements:


  • Tuned the presolver for some nonlinear problems.
  • Tweaked the primal simplex optimizer.


Please also note we have updated the license agreement. For further details see the complete release note.

Friday, March 28, 2014

SwissQuant Math Challenge on LASSO Regression: the MOSEK way

The Swiss company SwissQuant publishes a monthly math challenge in which they invite people to work on and discuss problems of different kinds, somehow related to the company's business. This month they focus on (quadratic) convex optimization, asking people to propose specialized algorithms for the OLS-LASSO problem that can beat established commercial codes for quadratic programming such as MOSEK. You can find the details here: Math Challenge March 2014.

Clearly we are glad they appreciate our product, but we also feel challenged to tackle the problem itself. MOSEK is not specialized for OLS-LASSO, but we thought it could still be very efficient, especially in dual form. Moreover, we are always eager to find good test problems, both for the core solver and for our APIs.

The OLS-LASSO problem

So let's start from the formulation given in the challenge, where we are given $T$ samples in dimension $m$:

\begin{equation}\label{main}\tag{OLS-LASSO}
  \begin{aligned}
    \min_{u,b} & \frac{1}{2} || u ||^2 + \sum_{j=1}^{m} \lambda_j |b_j|\\
    &\\
    s.t.&\\
    & u = y - X b
  \end{aligned}
\end{equation}

where $b,\lambda\in \mathbb{R}^m, u,y\in \mathbb{R}^T$. We reformulate problem \ref{main} so as to have a linear objective function:

\begin{equation*}
  \begin{aligned}
    \min_{u,b,z,w,s} &\quad z + \lambda^T w\\
    &\\
    s.t.&\\
    & u + X b = y\\
    & s=1\\
    & (s,z,u) \in Q_R^{T+2}\\
    & w_i \geq |b_i|& \forall i=1,\ldots,m.
  \end{aligned}
\end{equation*}

where $Q_R$ is the rotated cone. There are two ways to get rid of the absolute value:

(1) Linear constraints 

\[ w_i\geq |b_i| \Rightarrow \left\{ \begin{array}{l}  w \geq b \\ w \geq -b \\\end{array}\right.\]

(2) Second-order conic constraint

\[ w_i \geq |b_i| \Leftrightarrow w_i\geq \sqrt{b_i^2} \equiv (w_i,b_i)\in Q^2\]


We will see which primal form works best; call them PL and PQ for the linear and the conic reformulation of the absolute value, respectively. But it is the dual that we really want to see at work. Using the standard procedure outlined in our post about duality, we derive the dual of problem PQ. It is easier than for PL, believe us.

First let's fix a partition of the variable set as $[s,z,u,w,b]$, and clearly state the basic vectors and matrices:


(1) cost vector: $c^T= [0 \quad 1 \quad 0_T \quad \lambda^T \quad 0_m]$;
(2) right-hand side: $b=[1 \quad y^T]^T$;
(3) coefficient matrix: $A= \left[ \begin{array}{ccccc} 1 & 0&0_T&0_m&0_m\\ 0&0&I_T& 0_m&X\end{array}\right]$;


We then introduce as many dual variables as primal linear constraints:
$\mu_0$ for the first constraint, and $\mu\in \mathbb{R}^T$ for the
second one. We also include additional slack variables $\gamma$
partitioned in the same way as the primal variables, i.e. $\gamma^T=[\gamma_s,\gamma_z,\gamma_u,\gamma_w,\gamma_b]$.

The dual problem is

\begin{equation*}
  \begin{aligned}
    \max_{\mu_0, \mu, \gamma} & \quad \mu_0 + y^T \mu \\
    &\\
    s.t.&\\
    & c - A^T \left[ \begin{array}{c}\mu_0\\\mu\end{array} \right]= \gamma\\
    & (\gamma_s,\gamma_z,\gamma_u) \in Q_R^{T+2}\\
    & (\gamma_w,\gamma_b) \in \Pi_{i=1}^m Q^2.
  \end{aligned}
\end{equation*}

After some calculations, we obtain:

\begin{equation}\tag{D}
  \begin{aligned}
    \max_{\mu_0,\mu,\gamma_z} &  \quad \mu_0 + y^T \mu \\
    &\\
    s.t.&\\
    &  | X^T \mu | \leq  \lambda\\
    & \gamma_z=  1\\
    & \left(-\mu_0, \gamma_z, -\mu \right) \in Q_R^{2+T}.
  \end{aligned}
\end{equation}

Try to do the same with PL, you'll get the same model.

The results

We implement these three models using the MOSEK Fusion Python API and run a batch of tests for $m=[10,25,75,100]$ and $T$ from $10000$ to $100000$. To make it easy, $X,\lambda$ and $y$ are all uniformly generated in $[0,1]$. The code that interfaces with the models looks like the sketch below.
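This is a reconstruction, not the original listing: a sketch of the dual model (D) in recent Fusion syntax, with an illustrative function name and data layout (the primal models PL and PQ are built analogously).

import numpy as np
from mosek.fusion import Model, Domain, Expr, ObjectiveSense

def ols_lasso_dual(X, y, lam):
    # X: T-by-m numpy array, y: length-T array, lam: length-m array.
    T, m = X.shape
    M = Model("ols-lasso-dual")
    mu0 = M.variable("mu0", 1, Domain.unbounded())
    mu  = M.variable("mu",  T, Domain.unbounded())
    # |X^T mu| <= lambda, elementwise
    Xt_mu = Expr.mul(X.transpose(), mu)
    M.constraint(Xt_mu, Domain.lessThan(lam))
    M.constraint(Xt_mu, Domain.greaterThan(-lam))
    # (-mu0, 1, -mu) in the rotated quadratic cone Q_R^(T+2)
    M.constraint(Expr.vstack(Expr.neg(mu0), Expr.constTerm(1.0), Expr.neg(mu)),
                 Domain.inRotatedQCone())
    # maximize mu0 + y^T mu
    M.objective(ObjectiveSense.Maximize,
                Expr.add(Expr.sum(mu0), Expr.dot(y, mu)))
    M.solve()
    return M.primalObjValue()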


Here you can see the running times for problem PL in green, PQ in red and D in blue. In the first plot we fix $m$ and let $T$ vary:


and then we fix $T$ to either $10000$ or $100000$ and get



The dual is again a clear winner, as in a previous post! The primal cannot even get close to the dual.

But it is also interesting to see that PQ is as good as PL: the red and green lines are almost indistinguishable. Only for larger $m$, i.e. when the number of cones increases, can we see a significant difference. Otherwise the difference is very small.




Friday, March 21, 2014

Working with the dual in practice

Recently we have been discussing the use of the dual problem: in our last post we showed that building the dual can be a fairly automated process; we also argued that there are at least two good reasons why we may want or need to work with it:
  1. solving the dual may be more efficient (see this example);
  2. the dual variables relate to the reduced costs and shadow prices (see this post).
So either we solve the primal and would like to have the values of the dual variables, or the other way round! In both cases, it boils down to the question (remember the dual of the dual is the primal):

Given the solution of a linear program, how can we recover its dual variables?

We will discuss how to work on the dual solution for linear programs, but most of our arguments will apply to conic programs as well. Remember our primal in standard form:

\begin{equation}\tag{P}
\begin{aligned}
& \min \quad c^T x\\
& s.t.\\
& Ax -b =0\\
& x\geq 0,
\end{aligned}
\end{equation}

and the corresponding dual

\begin{equation}\tag{D}
\begin{aligned}
& \max \quad b^Ty\\
& s.t.\\
& c - A^Ty - s =0\\
& s\geq 0.
\end{aligned}
\end{equation}

For a given optimal primal solution $x^\star$ and dual $y^\star,s^\star$, duality theory tells us:

(1) Strong Duality: the optimal values of (P) and (D) coincide

(2) Complementary Slackness: it must hold that $x_i^\star s_i^\star = 0$ for all $i=1,\ldots,n.$ Thus $s_i^\star=0$ for all $i=1,\ldots,n$ such that $x_i^\star>0$.

(3) Optimal Basis: using the partition of the primal variables induced by the optimal basis, we can define a full-rank sub-matrix of the constraint coefficient matrix and use it to compute $y^\star$...but this goes way too far for a simple blog post; you should refer for instance to Section 3.5 in [1].

If points (1) and (2) are straightforward, point (3) involves some work and attention....should we really spend time on that? The answer is NO! The reason is that solvers usually provide some functionality to retrieve the dual variables after the primal has been solved:

1. Primal-dual linear solvers (such as MOSEK) solve the primal and the dual at the same time, and thus provide the dual variables directly.

2. Primal or dual solvers based on the simplex method have all the information needed to recover the dual variables easily from the optimal basis (see again [1]).

So you just need to query the solver you use to retrieve the dual values. Let's play with MOSEK and a toy linear programming problem taken from [2], Section 6.2. In LP format the primal is


and its dual is
Note that the two problems are formulated in a natural way. MOSEK represents linear problems as

\begin{equation}\tag{MP}
\begin{aligned}
& \min c^T x + c_f \\
& s.t. & \\
&& l^c \leq Ax \leq u^c\\
&& l^x \leq x \leq u^x
\end{aligned}
\end{equation}

which is a more general and natural way for the user. After some boring but trivial calculations we obtain the dual problem in the form

\begin{equation}\tag{MD}
\begin{aligned}
& \max (l^c)^T s_l^c - (u^c)^T s_u^c + (l^x)^T s_l^x - (u^x)^T s_u^x\\
& s.t. & \\
&& A^T(s_l^c - s_u^c)  + s_l^x - s_u^x=c\\
&& s_l^x,s_u^x, s^c_l, s_u^c \geq 0
\end{aligned}
\end{equation}

As you may see, there is exactly one dual variable for each primal constraint. Running the solver on the primal we get
and the solution looks like this. The columns named dual represent the corresponding dual variables. Running the solver on the dual we obtain
and the solution looks like this.
But wait a minute...it seems MOSEK just swaps primal and dual!?

True, in some sense! MOSEK implements a primal-dual algorithm that solves both the primal and the dual at the same time. For small problems there might be no difference at all. A few comments on the solver output:
  1. Feasibility and optimality are achieved at the same time: during the execution the intermediate solutions are neither primal nor dual feasible. This is shown in the columns PFEAS and DFEAS as the norm of the constraint violation.
  2. The columns POBJ and DOBJ show the progress towards the optimal value and, as the Strong Duality theorem tells us, they are the same once optimality is reached.
  3. Dual variables are reported in the LOWER/UPPER DUAL columns.
  4. To each non-active bound corresponds a zero dual variable, as per Complementary Slackness.
  5. Dual values are zero for each missing bound: this is because a solver, as MOSEK does, actually sets a very large bound (in absolute value) in order to overcome numerical issues. A missing bound therefore resolves in practice to a non-active bound, as in the previous point.
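As an illustration, with the Python Optimizer API the dual values shown above can also be retrieved programmatically, along the lines of this sketch (the file name is hypothetical and error handling is omitted):

import sys
import mosek

env  = mosek.Env()
task = env.Task(0, 0)
task.set_Stream(mosek.streamtype.log, sys.stdout.write)
task.readdata("toy.lp")              # the primal problem in LP format
task.optimize()
task.solutionsummary(mosek.streamtype.log)

numcon = task.getnumcon()
y   = [0.0] * numcon                 # dual variables of the linear constraints
slc = [0.0] * numcon                 # duals of the constraint lower bounds
suc = [0.0] * numcon                 # duals of the constraint upper bounds
task.gety(mosek.soltype.bas, y)      # use mosek.soltype.itr for the interior-point solution
task.getslc(mosek.soltype.bas, slc)
task.getsuc(mosek.soltype.bas, suc)
print(y, slc, suc)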

If you want to dig more into the meaning of the dual information, follow Section 6.2 in [2]. For us, it is important that you get the message:

Dual information is readily available from your solver and can be useful!

Most of what we have said is also, and even more, true for second-order conic programs: duality holds, but to play with it you should ask your preferred solver.

[1] Papadimitriou, C. H., & Steiglitz, K. (1998). Combinatorial optimization: algorithms and complexity. Courier Dover Publications.
[2] Williams, H. P. (1999). Model building in mathematical programming.

Tuesday, March 11, 2014

How to write the dual problem in a nutshell

Duality in linear programming is often mentioned and sometimes even used (see [2]). But in our experience, many practitioners unfortunately have a rather blurred idea about the dual problem and its meaning. Things are even worse when considering conic programming, despite the fact that most of the theory remains unchanged. There are at least three good reasons to have a good understanding of conic duality:
  • The dual problem could be faster to solve (see our post), even if the complexity is the same.
  • Sensitivity analysis uses dual information (see our post), just think about shadow prices.
  • Infeasibility and unboundedness certificates are stated in terms of duality, using the Farkas' Lemma.
Duality theory for conic optimization is rich and well-established and deserves much more space than a blog post (see for instance [1]). As a first step, we will show how to easily derive the dual formulation of a given conic optimization problem. For the sake of clarity, we will not consider semidefinite optimization problems, even though the conic framework readily extends to them.

Let's assume we are given an optimization problem, which we refer to as the primal. If we are lucky, the primal can be reformulated as a conic optimization problem by means of some transformations (see [1],[3]). A hopefully small additional effort is usually needed to cast our conic problem in (primal) standard form. Let the primal variables be reordered and partitioned such that  $x=(x_1,x_2,\ldots,x_k)^T\in \mathbb{R}^n$ with $x_i\in \mathbb{R}^{n_i}$. Then the standard primal form is:

\begin{equation}
\begin{aligned}
  &\min \quad c^Tx &&\\
  & s.t. &&\\
  &&Ax - b = 0 &\\
  &&x_i \in \mathit{K_i^{n_i}} &\quad i=1,\ldots\,k
\end{aligned}
\end{equation}

Note how each block variable $x_i$ belongs to a cone $\mathit{K_i^{n_i}}$; moreover, we can recover the standard LP formulation just by setting $k=1$ and $n_1=n$. In short we may write $x\in \Pi_{i=1}^k K_i^{n_i}$. For our purpose, $\mathit{K_i^{n_i}}$ can be

- the positive orthant $\mathbb{R}_+^{n_i}$
- a Lorentz cone $L_i^{n_i}$ of dimension $n_i$
- a rotated Lorentz cone $R_i^{n_i}$ of dimension $n_i$

We introduce a vector $y$ of dual variables, one for each linear constraint in Problem (1), and obtain the dual problem as follows:

\begin{equation}
\begin{aligned}
  &\max \quad b^Ty \\
  & s.t. \\
  &&  c - A^T y \in\Pi_{i=1}^k \mathit{K}_i^*\\
\end{aligned}
\end{equation}

where $\mathit{K}_i^*$ is the dual cone of $\mathit{K}_i$. In practice, the cones we work with are all self-dual, i.e. they are dual to themselves, which roughly speaking means $\mathit{K}_i^*\equiv \mathit{K}_i$. Introducing slack variables, Problem (2) reads:

\begin{equation}
\begin{aligned}
  &\max \quad b^Ty \\
  & s.t. \\
  && c - A^T y - s = 0\\
   &&s_i \in \mathit{K}^*_i  &\quad i=1,\ldots\,k
\end{aligned}
\end{equation}

which can be converted again into standard form by noticing that $\min f(x) = - \max -f(x)$ and by transforming free variables into non-negative ones (see for instance [2]).

Summarizing, we recommend following these steps:
  1. if possible recast the given problem in conic form
  2. transform the conic form in standard form
  3. derive the dual formulation
  4. simplify the dual if possible
In practice, step 1 requires the biggest effort, while the others follow quite directly. Indeed, the crucial point is to certify that the given problem can be expressed in conic form, and we do this by actually constructing such a representation. Once we get the conic formulation, most of the job is done. So, working in conic form also has the interesting side-effect of much simpler access to the dual formulation and information.

Check this post to see a practical example! Others will follow in the coming days.


[1] Ben-Tal, A., & Nemirovski, A. (2001). Lectures on modern convex optimization: analysis, algorithms, and engineering applications (Vol. 2). Siam.
[2] Papadimitriou, C. H., & Steiglitz, K. (1998). Combinatorial optimization: algorithms and complexity. Courier Dover Publications.
[3] MOSEK modeling manual, 2013, MOSEK ApS

Thursday, February 27, 2014

Some pitfalls of the sensitivity analysis for LPs

The good old sensitivity analysis for LPs... It tries to answer a crucial question:

How does the solution of our optimization model react to small perturbations of the input data?

Don't worry, we are not going to review the principles of sensitivity analysis (one can refer for instance to [1]), but we will show that the commonly used techniques can hide unexpected pitfalls.
In general, an LP sensitivity analyzer returns the shadow price of a given input parameter, i.e. broadly speaking the rate of variation of the objective function with respect to the parameter, and the interval in which that price is constant.

Every OR practitioner should know this tool and its potential pitfalls. But this is often not the case...
Here goes the story: one of our customers was puzzled by the results of the sensitivity analysis he was carrying out on some LP problems (related to some kind of unit commitment problem): why were the shadow prices of the solutions sometimes different? And why did they not seem to make much sense to him? 

He boiled it down to a VERY minimal example for us:

\begin{equation}
\begin{aligned}
& \min & 10^5 x_1 + x_2 \\
\\
&& x_1  + x_2  =100\\
\\
&& x_1,x_2 \in [0,100]
\end{aligned}
\end{equation}

The problem is so simple that it can be solved by hand: the optimal solution is $(x_1^\star, x_2^\star)=(0,100)$. It is also degenerate (either the upper or the lower bound can be dropped) with multiple optimal bases: to see that, let's recast the problem in standard form:

\begin{equation}
\begin{aligned}
& \min & 10^5 x_1 + x_2 \\
\\
&& x_1  + x_2  =100\\
&& x_1  + s_1  =100\\
&& x_2  + s_2  =100\\
\\
&& x_1,x_2,s_1,s_2 \geq 0
\end{aligned}
\end{equation}

The optimal solution now reads $(x_1^\star, x_2^\star, s_1^\star, s_2^\star )=(0,100,100,0)$. It is straightforward to see that there are two optimal bases: $B_1=\{x_2,s_1,s_2\}$ and $B_2=\{x_2,s_1,x_1\}$.

What if we ask for a sensitivity analysis of the linear constraint in problem (1)?  Using the command-line MOSEK sensitivity analyzer, the answer is:


BOUNDS CONSTRAINTS
INDEX    NAME  BOUND    LEFTRANGE        RIGHTRANGE       LEFTSHADOW       RIGHTSHADOW     
0        c0000 FX       -1.000000e+02    -0.000000e+00    1.000000e+00     1.000000e+00    


that is, the RHS can be reduced by up to $100$ (the LEFTRANGE) with a unitary shadow price.  But we have no information about what might happen if we increase the RHS. Let's visualize what we would like to investigate (the RHS is renamed $\beta$):

The gray area represents the region defined by the box constraints. The red line is the feasible region of problem (1) and the optimal value is the red dot in the upper-left corner. Varying the RHS results in a translation of that feasible region and a different optimal value. The yellow lines/points correspond to the feasible region/optimal solution as the RHS decreases; the light-blue ones, corresponding to an increase of the RHS, are not reported by the analyzer. But still, it is clearly possible to increase the RHS...so what's going on?

This problem is quite well-known in the OR community. In [3], the main issues and pitfalls are clearly summarized: the standard analysis, named Type-I, depends on the optimal basis. If more than one basis corresponds to an optimal solution, the analysis will in general be incomplete because it is limited by the change of basis.

In our case, the solutions obtained by decreasing the RHS share the same basis, while a different basis corresponds to the others. So the scope of our analysis is limited by the basis reported by the solver.

A more accurate analysis can be carried out using more computationally intensive techniques, sometimes referred to as Type-III sensitivity or Optimal Partition Sensitivity (OPS): simply speaking, a sequence of LP problems based on the current optimal solution is solved in order to assess the sensitivity, regardless of the particular basis that might correspond to the solution (see [2]).

Again, let's look at our example. We use the OPS tool available in MOSEK to analyze the same constraint again. The result follows:

BOUNDS CONSTRAINTS
INDEX    NAME  BOUND    LEFTRANGE        RIGHTRANGE       LEFTSHADOW       RIGHTSHADOW     
0        c0000 FX       -1.000000e+02    1.000000e+02     1.000000e+00     1.000000e+05    

Now we can also investigate increasing the RHS! As one might expect, a different shadow price corresponds to this side of the RHS interval.

The lesson is that in most practical applications optimal solutions are degenerate, and thus the Type-I sensitivity analysis might be incomplete. In general, there is no way to predict which basis will be returned by a solver. The OPS analysis, despite being more computationally demanding, can deliver more accurate information, and it is therefore a tool that practitioners should consider and learn.

[1] Chvátal, V. (1983). Linear programming. 
[2] Roos C., Terlaky T., Vial J. P. (1997). Theory and algorithms for linear optimization: an interior point approach. Chichester: Wiley.
[3] Koltai T.,  Terlaky T. (2000). The difference between the managerial and mathematical interpretation of sensitivity analysis results in linear programming. International Journal of Production Economics, 65(3), 257-274.



Thursday, February 20, 2014

Conic duality in practice: when dual beats the primal...


Despite its simplicity, the problem we have been discussing in the last two posts (see the first and second post) is a nice source of examples and insights about conic programming. The formulation we obtained is:

\begin{equation}
\begin{aligned}
\min r_0&\\
 s.t.&\\
 &\quad ( r_0\mathbf{e}, P-\mathbf{e}p_0^T) \in \Pi_{i=1}^k Q^{(n+1)}.
\end{aligned}
\end{equation}

Recall that $p_0,p_i$ are row vectors and $\mathbf{e}$ is a vector with all ones. We might wonder whether the dual problem of (1) is harder or easier to solve than the primal. Using conic programming duality, the dual can be derived pretty easily.

So let's use the duality theory of conic programming to derive the dual problem of (1). First, to derive the dual we restate (1) in a slightly different way:

\begin{equation}
\begin{aligned}
\min r_0&\\
 s.t.&\\
 &\quad ( r_0, p_0^T - p_i) \in Q^{(n+1)} & i=1,\ldots,k.
\end{aligned}
\end{equation}

Following [1], this is our primal formulation, and the construction of the dual problem should be quite straightforward! Consider the standard primal form (see p. 69 in [1]):

\begin{equation}
\begin{aligned}
\min c^T x&\\
 s.t.&\\
 &\quad A_i x_i -b_i \in K_i & i=1,\ldots,k,
\end{aligned}
\end{equation}

where the $K_i$'s are cones and $x_i=(r_0 , p_0)^T$: problem (2) follows by setting $ c=( 1 , 0_{n})^T$ and

\[
A_i= I_{n+1},\quad b_i= \begin{pmatrix} 0 \\ p_i^T \end{pmatrix},\quad K_i=Q^{(n+1)} , \quad i=1,\ldots,k.
\]

To formulate the dual, for each cone $i$ we introduce a vector $y_i \in R^{n+1}$ of dual variables such that $y_i$ must belong to the dual cone of $Q^{(n+1)}$, which, $Q^{(n+1)}$ being self-dual, is $Q^{(n+1)}$ itself. The dual problem takes the form

\begin{equation}
\begin{aligned}
\max & \sum_{i=1}^k b_i^T y_i\\
 s.t.&\\
 &\sum_{i=1}^k y_i = c\\
 &y_i\in Q^{(n+1)} & i=1,\ldots,k.
\end{aligned}
\end{equation}

which is equivalent to

\begin{equation}
\begin{aligned}
\max & \quad \mathrm{Tr}( B Y^T ) \\
 s.t.&\\
 & Y^T e_k = c\\
 &Y \in \Pi_{i=1}^k Q^{(n+1)}
\end{aligned}
\end{equation}

where \[ Y=  \begin{pmatrix} y_1^T \\ \vdots \\ y_k^T \end{pmatrix}, \quad B=  \begin{pmatrix} b_1^T \\ \vdots \\ b_k^T \end{pmatrix} = \begin{pmatrix} 0 & p_1 \\ \vdots & \vdots \\ 0 & p_k \end{pmatrix}, \quad e_k= \begin{pmatrix} 1\\ \vdots \\ 1\end{pmatrix}\in R^k.\]

Let's implement the dual formulation (5) using the MOSEK Fusion API. The code looks like
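The original listing is not reproduced here; the following is a sketch of formulation (5) in recent Fusion syntax (the function name is illustrative, and the exact stacking and slicing calls may need small adjustments between Fusion versions):

import numpy as np
from mosek.fusion import Model, Domain, Expr, ObjectiveSense

def smallest_sphere_dual(P):
    # P: k-by-n numpy array whose rows are the points p_i.
    k, n = P.shape
    c = [1.0] + [0.0] * n                       # c = (1, 0_n)
    M = Model("sphere-dual")
    # The rows of Y are the dual vectors y_i, each in a quadratic cone Q^(n+1).
    Y = M.variable("Y", [k, n + 1], Domain.inQCone())
    # sum_i y_i = c
    M.constraint(Expr.sum(Y, 0), Domain.equalsTo(c))
    # maximize Tr(B Y^T); the first column of B is zero, so only the
    # point coordinates P enter the objective.
    M.objective(ObjectiveSense.Maximize,
                Expr.sum(Expr.mulElm(P, Y.slice([0, 1], [k, n + 1]))))
    M.solve()
    return M.primalObjValue()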

The code is very simple, compact and self-commenting. Note how we also exploit the fact that the first column of $B$ is composed of zeros. What about performance? The dual beats the primal in every sense! Look at the plot below (tests have been done on my desktop machine, a Pentium(R) Dual-Core CPU E5700 @ 3.00GHz under Linux), where we report the running time needed to build the optimization model and to solve it.



The dual is much faster both in terms of the time spent building the model, which is indeed much simpler, and in terms of solver execution time! Both model building times scale almost linearly, and so does the solution time. If the former is expected, the latter might come as a surprise! Note, however, that model building time matters when tackling polynomially solvable problems....

Final take-away message:

- Solving the dual instead of the primal can be much more efficient!

- The conic formulation helps to derive the dual problem!


[1] Ben-Tal, A., & Nemirovski, A. (2001). Lectures on modern convex optimization: analysis, algorithms, and engineering applications (Vol. 2). Siam.

Monday, February 17, 2014

Smallest sphere containing points using Fusion

In the previous post we formulated the fairly intuitive problem
How to find the smallest sphere that encloses a set of $k$ points $p_i \in R^n$.
as a second order conic problem:

\begin{equation}
  \begin{aligned}
\min r_0&&&\\
s.t.&&&\\
& r_i = r_0,&\quad& i=1,\ldots,k\\
& w_i = p_i - p_0,&\quad& i=1,\ldots,k\\
& (r_i,w_i) \in Q^{(n+1)}, &\quad& i=1,\ldots,k.
\end{aligned}
\end{equation}


We will show in this post how the MOSEK Fusion API for conic optimization allows for a very compact prototyping. Let's start!

First, we assume that the points are represented as row vectors, i.e. $p_0=(p_0^1,\ldots,p_0^n)$. The given points are organized in a matrix $P$ by rows, i.e. $P\in R^{k\times n}$.

We reformulate the problem using a different, yet equivalent, formulation based on the fact, see [1], that an affine transformation preserves the conic representability of a set, and thus we can write:

\begin{equation}
r= r_0\mathbf{e}, \quad w = P - \mathbf{e}p_0.
\end{equation}

where $ \mathbf{e}\in R^k$ is a vector with all components equal to one. The formulation we are looking for is

\begin{equation}
\begin{aligned}
\min r_0&\\
 s.t.&\\
 &\quad ( r_0\mathbf{e}, P-\mathbf{e}p_0^T) \in \Pi_{i=1}^k Q^{(n+1)}.
\end{aligned}
\end{equation}

Now let's input the model using the Python Fusion API; for the other supported languages it would look much the same. Assuming the given points are stored in a matrix p, the following few steps suffice (click here for the whole code):

1- create an optimization model

2- declare the variables (note $p_0$ is a row vector)

3- define the constraints in one shot (here np is shorthand for numpy)


 Note how we "stack" columns side by side obtaining a $k\times (n+1) $ matrix of expressions, which will be interpreted line by line to obtain the $k$ ice-cream cones.

4- define the objective function


5- solve !
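The individual snippets for steps 1-5 are not reproduced here; the consolidated sketch below covers all five steps in recent Fusion syntax (the function name is illustrative, and the stacking calls may need small adjustments between Fusion versions):

import numpy as np
from mosek.fusion import Model, Domain, Expr, Var, ObjectiveSense

def smallest_sphere(p):
    # p: k-by-n numpy array whose rows are the given points.
    k, n = p.shape
    M = Model("minimal-sphere")                          # 1- the model
    r0 = M.variable("r0", 1, Domain.unbounded())         # 2- the radius
    p0 = M.variable("p0", [1, n], Domain.unbounded())    #    and the center (a row vector)
    # 3- all k cones in one shot: each row of [r0*e, P - e*p0] must lie in Q^(n+1)
    E = Expr.hstack(Var.repeat(r0, k),
                    Expr.sub(p, Expr.repeat(p0, k, 0)))
    M.constraint(E, Domain.inQCone())
    M.objective(ObjectiveSense.Minimize, r0)             # 4- the objective
    M.solve()                                            # 5- solve!
    return r0.level()[0], np.array(p0.level())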


Does it work? Of course it does....here is a solution for $n=1000000, k=2$.


Pretty neat right? Where is the magic and what are the pros and cons?

The magic is the ability of Fusion to convert the expressions into standard conic constraints, i.e. the first model we formulated: all the additional variables will be generated automatically by Fusion!

What does the same code look like with the standard Python API? Just take a look at the constraint declaration. First we need some offsets into the variable set:
Then a loop to create all the linear and conic constraints:



The code is still manageable, one might say, but that is partly because Python allows for very compact notation. The point is: could you easily deduce the model implemented by the code? And moreover, what if an offset goes wrong? The answers are up to you....

Fusion vs. Standard API

Pros:
- a cleaner and shorter code
- easier for the solver to analyze the structure of the model
- no need for indices and offsets to input the problem data
- closer to a modeling language

Cons:
- requires more modeling skills
- slower execution during model creation (not solver execution), at least for large models

We believe the pros usually outweigh the cons, especially in the long run, when users get used to the syntax and the modeling philosophy.

Give Fusion a try!

[1] Ben-Tal, A., & Nemirovski, A. (2001). Lectures on modern convex optimization: analysis, algorithms, and engineering applications (Vol. 2). Siam.

Friday, February 14, 2014

A nice example of conic optimization modeling

A few days ago, my boss Erling Andersen suggested that I look at this problem to learn the MOSEK Fusion API:
How to find the smallest sphere that encloses a set of $k$ points $p_i \in R^n$.
The intuitive formulation is to minimize the radius $r$ of the sphere centered at $p_0 \in R^n$, i.e.

\begin{align}
& \min r \\
& s.t. \\
& & || p_i - p_0||_2 \leq r, & i=1,\ldots,k
\end{align}

The inspiration came from a recent book¹ where the problem, probably taken from the GAMS repository², is presented as a difficult test for nonlinear local optimizers.

At the same time (what a coincidence!), O'Neil and Puget published interesting posts about the Chebyshev centers of polygons...which turns out to be equivalent to what I was looking at! As Puget pointed out, the problem is known to be simple: Megiddo³ showed back in 1982 that it can be solved in linear time by a specialized algorithm.

Anyway, we will use it as an example of conic programming modeling. First we introduce auxiliary variables such that  $w_i = p_i - p_0$, yielding

\begin{align}
& \min r \\
& s.t.\\
& & || w_i ||_2 \leq r, & i=1,\ldots,k\\
& & w_i = p_i - p_0, & i=1,\ldots,k
\end{align}

Constraints (6) define quadratic (Lorentz) cones $Q^{(n+1)}$, i.e.

\begin{equation} || w_i ||_2   \leq r    \Rightarrow (r, w_i) \in  Q^{(n+1)}.\end{equation}

For the second-order cones to be disjoint, we introduce auxiliary variables $r_i, i=1,\ldots,k$ that must be equal to $r$. The final formulation of our problem is

\begin{align}
& \min r\\
& s.t.\\
& & r_i = r,  & i=1,\ldots,k\\
& & w_i = p_i - p_0, & i=1,\ldots,k\\
& & (r_i,w_i) \in  Q^{(n+1)}, & i=1,\ldots,k
\end{align}

which is a fairly simple conic quadratic optimization problem!

In the next post we will show how to easily implement this problem using the Mosek Fusion API.

¹ Neculai, Andrei, "Nonlinear Optimization Application Using the GAMS Technology", Springer 2013
² http://www.gams.com/testlib/libhtml/circlen.htm
³ Megiddo, Nimrod. "Linear-time algorithms for linear programming in R3 and related problems." Foundations of Computer Science, 1982 (SFCS 1982), 23rd Annual Symposium on. IEEE, 1982.

New repository for Rmosek interface available!

As some Rmosek users might know, we have been experiencing technical issues with the CRAN repository that prevent us from updating the Rmosek package.

We have decided to host the package repository at  mosek.com, which will make it easy for users to install the latest release simply from the R command line. Follow us on the dedicated website at



for more updates and details!

Monday, February 10, 2014

A New Product Manager at Mosek

Dear all,
my name is Andrea Cassioli, and I am glad to announce that from February the 1st, I am the new Mosek Product Manager.

Let me introduce myself briefly: my background is a blend of computer science and operational research. I got a Ph.D. in Computer Science at the University of Florence (IT), working on heuristic methods for difficult global optimization problems. Then I worked as a post-doc researcher at the University of Florence (IT), the Massachusetts General Hospital (US) and lastly at the Ecole Polytechnique (FR). My research has always focused on solving optimization problems from real-life applications, for instance machine learning, urban traffic equilibrium, radiotherapy planning and structural biology. I enjoy coding in several programming languages (C/C++, Python, Java).

I'd like to thank the Mosek CEO E. Andersen for choosing me for this position and all the people  at Mosek for the help they are providing me in this initial period.

I look forward to working with all of you and contributing to the growth of the company!

Best,
Andrea Cassioli







Wednesday, February 5, 2014

The MOSEK/Julia interface is now official

At MOSEK we have for some time been aware of the new language called "Julia" (http://julialang.org), but have not really had an excuse to play with it. Then, when we attended the INFORMS conference in Minneapolis last year, we met a couple of guys working on a mathematical modeling extension for Julia (https://github.com/JuliaOpt/JuMP.jl). Fast forward one day and the first MOSEK/Julia interface prototype had been created (first two prototypes, actually - one by me and one by the Julia guys), and from there it just rolled. Now, a couple of months on, we have an almost complete port of the MOSEK solver API and an implementation of the MathProgBase interface allowing LP, QP, SOCP and mixed integer problems to be solved from JuMP using MOSEK.

The JuMP guys from INFORMS 2013 have provided invaluable help and feedback, so many thanks to Miles Lubin and Iain Dunning!

While at first sight there is nothing special about the Julia language, a bit of further use quickly shows that it is not just another Matlab-but-done-right or LISP-with-a-new-syntax (I've seen that comment more than once). Or maybe it is - only actually done right and with a pleasant new syntax! I don't think I can point out any new and innovative features, only the tried and tested old ones combined in a sensible way... You could even get the impression that the language designers knew what they were doing! There are still some rough edges (start-up times, for example), though you can easily forgive that in a version 0.2.0 that is more usable than most version 1.0s.

Friday, January 3, 2014

MOSEK version 7.0.0.101

MOSEK version 7.0.0.101 is available for download. The following issues have been fixed:
  • Fixed a bug in the presolve of the mixed-integer optimizer.
  • Fixed a potential memory leak in the Python parts of MOSEK.
  • Fixed a bug in putarow and putacol.
  • Fixed a bug in the presolve potentially affecting all problem types.
  • Fixed a bug in the mixed integer optimizer.
  • Fixed an issue in Fusion.
  • Fixed some typos in the manuals.
  • Added missing MATLAB Fusion examples to the distribution.
  • Fixed an issue where the value of the ordering method parameter was ignored in certain cases.